Converted Notebook

Converted from IPYNB format

code cell
In [1]:
library(ArchR) library(parallel) library(ggplot2) library(dplyr)
Output:
/ | / \ . / |. \\\ / |. \\\ / `|. \\\ / |. \ / |\ \\#####\ / || ==###########> / || \\##==......\ / || ______ = =|__ /__ || \\\ ,--' ,----`-,__ ___/' --,-`-===================##========> \ ' ##_______ _____ ,--,__,=##,__ /// , __== ___,-,__,--'#' ===' `-' | ##,-/ -,____,---' \\####\\________________,--\\_##,/ ___ .______ ______ __ __ .______ / \ | _ \ / || | | | | _ \ / ^ \ | |_) | | ,----'| |__| | | |_) | / /_\ \ | / | | | __ | | / / _____ \ | |\ \\___ | `----.| | | | | |\ \\___. /__/ \__\ | _| `._____| \______||__| |__| | _| `._____| ArchR : Version 1.0.3 For more information see our website : www.ArchRProject.com If you encounter a bug please report : https://github.com/GreenleafLab/ArchR/issues Loading Required Packages... Loading Package : devtools v2.4.5 Warning message: “package ‘devtools’ was built under R version 4.3.3” Warning message: “package ‘usethis’ was built under R version 4.3.3” Loading Package : grid v4.3.2 Loading Package : gridExtra v2.3 Loading Package : gtools v3.9.5 Loading Package : gtable v0.3.5 Warning message: “package ‘gtable’ was built under R version 4.3.3” Loading Package : ggplot2 v3.5.1 Warning message: “package ‘ggplot2’ was built under R version 4.3.3” Loading Package : magrittr v2.0.3 Loading Package : plyr v1.8.9 Loading Package : stringr v1.5.1 Loading Package : data.table v1.17.8 Warning message: “package ‘data.table’ was built under R version 4.3.3” Loading Package : matrixStats v1.5.0 Loading Package : sparseMatrixStats v1.14.0 Warning message: “package ‘sparseMatrixStats’ was built under R version 4.3.3” Warning message: “package ‘MatrixGenerics’ was built under R version 4.3.3” Loading Package : S4Vectors v0.40.2 Warning message: “package ‘S4Vectors’ was built under R version 4.3.3” Loading Package : GenomicRanges v1.54.1 Warning message: “package ‘IRanges’ was built under R version 4.3.3” Loading Package : BiocGenerics v0.48.1 Loading Package : Matrix v1.6.3 Loading Package : Rcpp v1.1.0 Loading Package : RcppArmadillo v0.12.8.2.1 Warning message: “package ‘RcppArmadillo’ was built under R version 4.3.3” Loading Package : SummarizedExperiment v1.32.0 Loading Package : rhdf5 v2.46.1 Warning message: “package ‘rhdf5’ was built under R version 4.3.3” Attaching package: ‘dplyr’ The following object is masked from ‘package:Biobase’: combine The following objects are masked from ‘package:GenomicRanges’: intersect, setdiff, union The following object is masked from ‘package:GenomeInfoDb’: intersect The following objects are masked from ‘package:IRanges’: collapse, desc, intersect, setdiff, slice, union The following objects are masked from ‘package:S4Vectors’: first, intersect, rename, setdiff, setequal, union The following objects are masked from ‘package:BiocGenerics’: combine, intersect, setdiff, union The following object is masked from ‘package:matrixStats’: count The following objects are masked from ‘package:data.table’: between, first, last The following objects are masked from ‘package:plyr’: arrange, count, desc, failwith, id, mutate, rename, summarise, summarize The following object is masked from ‘package:gridExtra’: combine The following objects are masked from ‘package:stats’: filter, lag The following objects are masked from ‘package:base’: intersect, setdiff, setequal, union
markdown cell

1. Create Arrow Files

code cell
In [2]:
setwd('/share2/pub/zhouyj/zhouyj/Human_eye_atlas/analysis/demo')
code cell
In [3]:
inputFiles_example <- c("/share2/pub/zhouyj/zhouyj/Human_eye_atlas/GSE228370/SRR23990988/outs/fragments.tsv.gz", "/share2/pub/zhouyj/zhouyj/Human_eye_atlas/GSE228370/SRR23990989/outs/fragments.tsv.gz") names(inputFiles_example) <- c("fetal_eye_except_retina","fetal_retina")
code cell
In [4]:
addArchRGenome("hg38") ## Setting default genome to Hg38. addArchRThreads(threads = 16) ## Setting default number of Parallel threads to 16.
Output:
Setting default genome to Hg38. Setting default number of Parallel threads to 16.
code cell
In [6]:
ArrowFiles <- createArrowFiles( inputFiles = inputFiles_example, sampleNames = names(inputFiles_example), minTSS = 4, #Dont set this too high because you can always increase later minFrags = 1000, addTileMat = TRUE, addGeneScoreMat = TRUE )
Output:
Using GeneAnnotation set by addArchRGenome(Hg38)! Using GeneAnnotation set by addArchRGenome(Hg38)! Warning message: “package ‘XVector’ was built under R version 4.3.3” Warning message: “package ‘rtracklayer’ was built under R version 4.3.3” ArchR logging to : ArchRLogs/ArchR-createArrows-6f9e0486afd59-Date-2025-10-29_Time-15-08-19.252704.log If there is an issue, please report to github with logFile! Cleaning Temporary Files subThreading Disabled since ArchRLocking is TRUE see `addArchRLocking` 2025-10-29 15:08:19.788029 : Batch Execution w/ safelapply!, 0 mins elapsed. ArchR logging successful to : ArchRLogs/ArchR-createArrows-6f9e0486afd59-Date-2025-10-29_Time-15-08-19.252704.log
markdown cell

2. Doublet inference

code cell
In [7]:
doubScores <- addDoubletScores( input = ArrowFiles, k = 10, #Refers to how many cells near a "pseudo-doublet" to count. knnMethod = "UMAP", #Refers to the embedding to use for nearest neighbor search with doublet projection. LSIMethod = 1 )
Output:
ArchR logging to : ArchRLogs/ArchR-addDoubletScores-6f9e033bc5927-Date-2025-10-29_Time-15-23-15.87131.log If there is an issue, please report to github with logFile! 2025-10-29 15:23:15.969165 : Batch Execution w/ safelapply!, 0 mins elapsed. 2025-10-29 15:23:15.978678 : fetal_eye_except_retina (1 of 2) : Computing Doublet Statistics, 0 mins elapsed. Warning message: “package ‘sp’ was built under R version 4.3.3” Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics' Also defined by ‘spam’ Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics' Also defined by ‘spam’ fetal_eye_except_retina (1 of 2) : UMAP Projection R^2 = 0.97113 fetal_eye_except_retina (1 of 2) : UMAP Projection R^2 = 0.97113 2025-10-29 15:26:23.22016 : fetal_retina (2 of 2) : Computing Doublet Statistics, 3.121 mins elapsed. Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics' Also defined by ‘spam’ Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics' Also defined by ‘spam’ fetal_retina (2 of 2) : UMAP Projection R^2 = 0.88655 fetal_retina (2 of 2) : UMAP Projection R^2 = 0.88655 ArchR logging successful to : ArchRLogs/ArchR-addDoubletScores-6f9e033bc5927-Date-2025-10-29_Time-15-23-15.87131.log
markdown cell

3. Creating An ArchRProject

code cell
In [8]:
projHeme1 <- ArchRProject( ArrowFiles = ArrowFiles, outputDirectory = "GSEGSE228370_26_post-conceptional_weeks", copyArrows = TRUE )
Output:
Using GeneAnnotation set by addArchRGenome(Hg38)! Using GeneAnnotation set by addArchRGenome(Hg38)! Validating Arrows... Getting SampleNames... Copying ArrowFiles to Ouptut Directory! If you want to save disk space set copyArrows = FALSE 1 2 Getting Cell Metadata... Merging Cell Metadata... Initializing ArchRProject... / | / \ . / |. \\\ / |. \\\ / `|. \\\ / |. \ / |\ \\#####\ / || ==###########> / || \\##==......\ / || ______ = =|__ /__ || \\\ ,--' ,----`-,__ ___/' --,-`-===================##========> \ ' ##_______ _____ ,--,__,=##,__ /// , __== ___,-,__,--'#' ===' `-' | ##,-/ -,____,---' \\####\\________________,--\\_##,/ ___ .______ ______ __ __ .______ / \ | _ \ / || | | | | _ \ / ^ \ | |_) | | ,----'| |__| | | |_) | / /_\ \ | / | | | __ | | / / _____ \ | |\ \\___ | `----.| | | | | |\ \\___. /__/ \__\ | _| `._____| \______||__| |__| | _| `._____|
code cell
In [9]:
head(projHeme1@cellColData)
Result:
DataFrame with 6 rows and 15 columns Sample TSSEnrichment <Rle> <array> fetal_eye_except_retina#ACAGAAAAGGTTCGTT-1 fetal_eye_except_ret.. 20.692 fetal_eye_except_retina#TAAGTGCTCACCACAA-1 fetal_eye_except_ret.. 14.577 fetal_eye_except_retina#AATGTCGGTTCCAATG-1 fetal_eye_except_ret.. 24.611 fetal_eye_except_retina#GATGGCCTCCTGTAGA-1 fetal_eye_except_ret.. 22.153 fetal_eye_except_retina#GCTGCGACAGCGTGAA-1 fetal_eye_except_ret.. 13.669 fetal_eye_except_retina#ACTATTCCATTAGCCA-1 fetal_eye_except_ret.. 21.086 ReadsInTSS ReadsInPromoter <array> <array> fetal_eye_except_retina#ACAGAAAAGGTTCGTT-1 47891 43771 fetal_eye_except_retina#TAAGTGCTCACCACAA-1 21768 22048 fetal_eye_except_retina#AATGTCGGTTCCAATG-1 66295 58896 fetal_eye_except_retina#GATGGCCTCCTGTAGA-1 49281 45561 fetal_eye_except_retina#GCTGCGACAGCGTGAA-1 21123 20829 fetal_eye_except_retina#ACTATTCCATTAGCCA-1 45575 40526 ReadsInBlacklist PromoterRatio <array> <array> fetal_eye_except_retina#ACAGAAAAGGTTCGTT-1 1627 0.227511824938926 fetal_eye_except_retina#TAAGTGCTCACCACAA-1 1700 0.115316220004603 fetal_eye_except_retina#AATGTCGGTTCCAATG-1 1484 0.312740943702807 fetal_eye_except_retina#GATGGCCTCCTGTAGA-1 1395 0.262448156682028 fetal_eye_except_retina#GCTGCGACAGCGTGAA-1 1440 0.130371918931437 fetal_eye_except_retina#ACTATTCCATTAGCCA-1 1365 0.259952019910454 PassQC NucleosomeRatio <array> <array> fetal_eye_except_retina#ACAGAAAAGGTTCGTT-1 1 0.314821902080315 fetal_eye_except_retina#TAAGTGCTCACCACAA-1 1 0.593220338983051 fetal_eye_except_retina#AATGTCGGTTCCAATG-1 1 0.409870184317307 fetal_eye_except_retina#GATGGCCTCCTGTAGA-1 1 0.356758784544204 fetal_eye_except_retina#GCTGCGACAGCGTGAA-1 1 0.492164004856636 fetal_eye_except_retina#ACTATTCCATTAGCCA-1 1 0.506639349015212 nMultiFrags nMonoFrags nFrags <array> <array> <array> fetal_eye_except_retina#ACAGAAAAGGTTCGTT-1 8751 73162 96195 fetal_eye_except_retina#TAAGTGCTCACCACAA-1 5815 60003 95598 fetal_eye_except_retina#AATGTCGGTTCCAATG-1 6227 66787 94161 fetal_eye_except_retina#GATGGCCTCCTGTAGA-1 10097 63976 86800 fetal_eye_except_retina#GCTGCGACAGCGTGAA-1 11363 53535 79883 fetal_eye_except_retina#ACTATTCCATTAGCCA-1 5615 51737 77949 nDiFrags DoubletScore <array> <array> fetal_eye_except_retina#ACAGAAAAGGTTCGTT-1 14282 32.3282521623482 fetal_eye_except_retina#TAAGTGCTCACCACAA-1 29780 20.4734703975564 fetal_eye_except_retina#AATGTCGGTTCCAATG-1 21147 10.1135106689874 fetal_eye_except_retina#GATGGCCTCCTGTAGA-1 12727 2.4897703541641 fetal_eye_except_retina#GCTGCGACAGCGTGAA-1 14985 9.02590977818845 fetal_eye_except_retina#ACTATTCCATTAGCCA-1 20597 0.330631173274085 DoubletEnrichment <array> fetal_eye_except_retina#ACAGAAAAGGTTCGTT-1 3.36 fetal_eye_except_retina#TAAGTGCTCACCACAA-1 2.84 fetal_eye_except_retina#AATGTCGGTTCCAATG-1 2.32 fetal_eye_except_retina#GATGGCCTCCTGTAGA-1 1.8 fetal_eye_except_retina#GCTGCGACAGCGTGAA-1 2.24 fetal_eye_except_retina#ACTATTCCATTAGCCA-1 1.6 BlacklistRatio <array> fetal_eye_except_retina#ACAGAAAAGGTTCGTT-1 0.00845678049794688 fetal_eye_except_retina#TAAGTGCTCACCACAA-1 0.00889139940166112 fetal_eye_except_retina#AATGTCGGTTCCAATG-1 0.00788012021962384 fetal_eye_except_retina#GATGGCCTCCTGTAGA-1 0.00803571428571428 fetal_eye_except_retina#GCTGCGACAGCGTGAA-1 0.00901318177835084 fetal_eye_except_retina#ACTATTCCATTAGCCA-1 0.00875572489704807
code cell
In [10]:
df <- getCellColData(projHeme1, select = c("log10(nFrags)", "TSSEnrichment"))
code cell
In [11]:
p <- ggPoint( x = df[,1], y = df[,2], colorDensity = TRUE, continuousSet = "sambaNight", xlabel = "Log10 Unique Fragments", ylabel = "TSS Enrichment", xlim = c(log10(500), quantile(df[,1], probs = 0.99)), ylim = c(0, quantile(df[,2], probs = 0.99)) ) + geom_hline(yintercept = 4, lty = "dashed") + geom_vline(xintercept = 3, lty = "dashed") p
Image output
code cell
In [12]:
p1 <- plotGroups( ArchRProj = projHeme1, groupBy = "Sample", colorBy = "cellColData", name = "TSSEnrichment", plotAs = "ridges", baseSize = 10 ) p1
Output:
1 Picking joint bandwidth of 1.1
Image output
code cell
In [13]:
p2 <- plotGroups( ArchRProj = projHeme1, groupBy = "Sample", colorBy = "cellColData", name = "TSSEnrichment", plotAs = "violin", alpha = 0.4, baseSize = 10, addBoxPlot = TRUE, ) p2
Output:
1
Image output
code cell
In [14]:
p3 <- plotFragmentSizes(ArchRProj = projHeme1) p3
Output:
ArchR logging to : ArchRLogs/ArchR-plotFragmentSizes-6f9e065d2ae8c-Date-2025-10-29_Time-15-39-03.720819.log If there is an issue, please report to github with logFile! ArchR logging successful to : ArchRLogs/ArchR-plotFragmentSizes-6f9e065d2ae8c-Date-2025-10-29_Time-15-39-03.720819.log
Image output
code cell
In [15]:
p4 <- plotTSSEnrichment(ArchRProj = projHeme1) p4
Output:
ArchR logging to : ArchRLogs/ArchR-plotTSSEnrichment-6f9e06e390d10-Date-2025-10-29_Time-15-39-35.463365.log If there is an issue, please report to github with logFile! subThreading Disabled since ArchRLocking is TRUE see `addArchRLocking` 2025-10-29 15:39:36.048239 : fetal_eye_except_retina Computing TSS (1 of 2)!, 0.01 mins elapsed. 2025-10-29 15:40:38.608483 : fetal_eye_except_retina Finished Computing TSS (1 of 2)!, 1.052 mins elapsed. 2025-10-29 15:40:38.65165 : fetal_retina Computing TSS (2 of 2)!, 1.053 mins elapsed. 2025-10-29 15:41:08.583292 : fetal_retina Finished Computing TSS (2 of 2)!, 1.552 mins elapsed. ArchR logging successful to : ArchRLogs/ArchR-plotTSSEnrichment-6f9e06e390d10-Date-2025-10-29_Time-15-39-35.463365.log
Image output
code cell
In [16]:
projHeme1 <- saveArchRProject(ArchRProj = projHeme1, outputDirectory = "Save-ProjHeme1", load = TRUE)
Output:
Copying ArchRProject to new outputDirectory : /share2/pub/zhouyj/zhouyj/Human_eye_atlas/analysis/demo/Save-ProjHeme1 Copying Arrow Files... Copying Arrow Files (1 of 2) Copying Arrow Files (2 of 2) Getting ImputeWeights No imputeWeights found, returning NULL Copying Other Files... Saving ArchRProject... Loading ArchRProject... Successfully loaded ArchRProject! / | / \ . / |. \\\ / |. \\\ / `|. \\\ / |. \ / |\ \\#####\ / || ==###########> / || \\##==......\ / || ______ = =|__ /__ || \\\ ,--' ,----`-,__ ___/' --,-`-===================##========> \ ' ##_______ _____ ,--,__,=##,__ /// , __== ___,-,__,--'#' ===' `-' | ##,-/ -,____,---' \\####\\________________,--\\_##,/ ___ .______ ______ __ __ .______ / \ | _ \ / || | | | | _ \ / ^ \ | |_) | | ,----'| |__| | | |_) | / /_\ \ | / | | | __ | | / / _____ \ | |\ \\___ | `----.| | | | | |\ \\___. /__/ \__\ | _| `._____| \______||__| |__| | _| `._____|
code cell
In [17]:
projHeme2 <- filterDoublets(projHeme1)
Output:
Filtering 297 cells from ArchRProject! fetal_eye_except_retina : 285 of 5346 (5.3%) fetal_retina : 12 of 1136 (1.1%)
code cell
rm(projHeme1) gc()
markdown cell

4. Dimensionality Reduction

code cell
In [27]:
projHeme2 <- addIterativeLSI( ArchRProj = projHeme2, useMatrix = "TileMatrix", name = "IterativeLSI", iterations = 2, clusterParams = list( #See Seurat::FindClusters resolution = c(0.2), #or c(0.2,0.4,0.6) sampleCells = 10000, n.start = 10 ), varFeatures = 25000, dimsToUse = 1:30 )
Output:
Checking Inputs... ArchR logging to : ArchRLogs/ArchR-addIterativeLSI-6f9e030c91e29-Date-2025-10-29_Time-19-37-27.020365.log If there is an issue, please report to github with logFile!
code cell
In [6]:
#Batch effect correction projHeme2 <- addHarmony( ArchRProj = projHeme2, reducedDims = "IterativeLSI", name = "Harmony", groupBy = "Sample" )
Output:
Warning message: “package ‘harmony’ was built under R version 4.3.3” Transposing data matrix Initializing state using k-means centroids initialization Harmony 1/10 Harmony 2/10 Harmony 3/10 Harmony converged after 3 iterations
code cell
In [7]:
packageVersion('harmony')
Result:
[1] ‘1.2.1’
markdown cell

5. Clustering using Seurat

code cell
In [8]:
projHeme2 <- addClusters( input = projHeme2, reducedDims = "IterativeLSI", method = "Seurat", name = "Clusters", resolution = 0.8 )
Output:
ArchR logging to : ArchRLogs/ArchR-addClusters-91a87078eb9c-Date-2025-10-30_Time-09-28-26.056578.log If there is an issue, please report to github with logFile! Warning message: “package ‘sp’ was built under R version 4.3.3” 2025-10-30 09:28:38.222482 : Running Seurats FindClusters (Stuart et al. Cell 2019), 0.191 mins elapsed. Computing nearest neighbor graph Computing SNN
Output:
Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck Number of nodes: 6185 Number of edges: 248893 Running Louvain algorithm... Maximum modularity in 10 random starts: 0.8538 Number of communities: 15 Elapsed time: 0 seconds
Output:
2025-10-30 09:28:44.91351 : Testing Outlier Clusters, 0.303 mins elapsed. 2025-10-30 09:28:44.985308 : Assigning Cluster Names to 15 Clusters, 0.304 mins elapsed. 2025-10-30 09:28:45.08451 : Finished addClusters, 0.306 mins elapsed.
code cell
In [9]:
cM <- confusionMatrix(paste0(projHeme2$Clusters), paste0(projHeme2$Sample)) cM
Result:
15 x 2 sparse Matrix of class "dgCMatrix" fetal_eye_except_retina fetal_retina C3 85 184 C1 330 . C12 459 24 C5 247 34 C10 978 9 C8 85 1 C11 1668 5 C4 103 388 C6 117 136 C7 301 5 C14 223 69 C13 58 55 C15 223 155 C2 12 57 C9 172 2
code cell
In [10]:
#plot confusion matrix as a heatmap library(pheatmap) cM <- cM / Matrix::rowSums(cM) p5 <- pheatmap::pheatmap( mat = as.matrix(cM), color = paletteContinuous("whiteBlue"), border_color = "black" ) p5
Output:
Warning message: “package ‘pheatmap’ was built under R version 4.3.3”
Image output
markdown cell

6. Single-cell Embedding

code cell
In [11]:
projHeme2 <- addUMAP( ArchRProj = projHeme2, reducedDims = "IterativeLSI", name = "UMAP", nNeighbors = 30, minDist = 0.5, metric = "cosine" )
Output:
09:31:46 UMAP embedding parameters a = 0.583 b = 1.334 Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics' Also defined by ‘spam’ 09:31:46 Read 6185 rows and found 30 numeric columns 09:31:46 Using Annoy for neighbor search, n_neighbors = 30 Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics' Also defined by ‘spam’ 09:31:46 Building Annoy index with metric = cosine, n_trees = 50 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * | 09:31:48 Writing NN index file to temp file /tmp/RtmpJIsaLF/file91a850ab997b 09:31:48 Searching Annoy index using 20 threads, search_k = 3000 09:31:48 Annoy recall = 90.9% 09:31:49 Commencing smooth kNN distance calibration using 20 threads with target n_neighbors = 30 09:31:50 Initializing from normalized Laplacian + noise (using RSpectra) 09:31:50 Commencing optimization for 500 epochs, with 299900 positive edges 09:32:12 Optimization finished 09:32:12 Creating temp model dir /tmp/RtmpJIsaLF/dir91a817c363cf 09:32:12 Creating dir /tmp/RtmpJIsaLF/dir91a817c363cf 09:32:13 Changing to /tmp/RtmpJIsaLF/dir91a817c363cf 09:32:13 Creating /share2/pub/zhouyj/zhouyj/Human_eye_atlas/analysis/demo/Save-ProjHeme2/Embeddings/Save-Uwot-UMAP-Params-IterativeLSI-91a8276ff40d-Date-2025-10-30_Time-09-32-12.614686.tar
code cell
In [13]:
#Plot and save by sample and clusters p6 <- plotEmbedding(ArchRProj = projHeme2, colorBy = "cellColData", name = "Sample", embedding = "UMAP") p7 <- plotEmbedding(ArchRProj = projHeme2, colorBy = "cellColData", name = "Clusters", embedding = "UMAP") ggAlignPlots(p6, p7, type = "h") plotPDF(p6,p7, name = "Plot-UMAP-Sample-Clusters.pdf", ArchRProj = projHeme2, addDOC = FALSE, width = 5, height = 5)
Output:
ArchR logging to : ArchRLogs/ArchR-plotEmbedding-91a864fd6c48-Date-2025-10-30_Time-09-34-32.389676.log If there is an issue, please report to github with logFile! Getting UMAP Embedding ColorBy = cellColData Plotting Embedding 1 ArchR logging successful to : ArchRLogs/ArchR-plotEmbedding-91a864fd6c48-Date-2025-10-30_Time-09-34-32.389676.log ArchR logging to : ArchRLogs/ArchR-plotEmbedding-91a8669fb1a8-Date-2025-10-30_Time-09-34-32.864689.log If there is an issue, please report to github with logFile! Getting UMAP Embedding ColorBy = cellColData Plotting Embedding 1 ArchR logging successful to : ArchRLogs/ArchR-plotEmbedding-91a8669fb1a8-Date-2025-10-30_Time-09-34-32.864689.log Plotting Ggplot! Plotting Ggplot!
Image output
code cell
In [14]:
#Vesion after harmony projHeme2 <- addUMAP( ArchRProj = projHeme2, reducedDims = "Harmony", name = "UMAPHarmony", nNeighbors = 30, minDist = 0.5, metric = "cosine" )
Output:
09:36:55 UMAP embedding parameters a = 0.583 b = 1.334 Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics' Also defined by ‘spam’ 09:36:55 Read 6185 rows and found 30 numeric columns 09:36:55 Using Annoy for neighbor search, n_neighbors = 30 Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics' Also defined by ‘spam’ 09:36:55 Building Annoy index with metric = cosine, n_trees = 50 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * | 09:36:57 Writing NN index file to temp file /tmp/RtmpJIsaLF/file91a84363871a 09:36:57 Searching Annoy index using 20 threads, search_k = 3000 09:36:57 Annoy recall = 100% 09:36:58 Commencing smooth kNN distance calibration using 20 threads with target n_neighbors = 30 09:36:59 Initializing from normalized Laplacian + noise (using RSpectra) 09:36:59 Commencing optimization for 500 epochs, with 290260 positive edges 09:37:20 Optimization finished 09:37:20 Creating temp model dir /tmp/RtmpJIsaLF/dir91a87a4a6e04 09:37:20 Creating dir /tmp/RtmpJIsaLF/dir91a87a4a6e04 09:37:21 Changing to /tmp/RtmpJIsaLF/dir91a87a4a6e04 09:37:21 Creating /share2/pub/zhouyj/zhouyj/Human_eye_atlas/analysis/demo/Save-ProjHeme2/Embeddings/Save-Uwot-UMAP-Params-Harmony-91a87236768c-Date-2025-10-30_Time-09-37-20.579837.tar
code cell
In [15]:
p8 <- plotEmbedding(ArchRProj = projHeme2, colorBy = "cellColData", name = "Sample", embedding = "UMAPHarmony") p9 <- plotEmbedding(ArchRProj = projHeme2, colorBy = "cellColData", name = "Clusters", embedding = "UMAPHarmony") ggAlignPlots(p8, p9, type = "h") plotPDF(p8,p9, name = "Plot-UMAPHarmony-Sample-Clusters.pdf", ArchRProj = projHeme2, addDOC = FALSE, width = 5, height = 5)
Output:
ArchR logging to : ArchRLogs/ArchR-plotEmbedding-91a8465bb1fd-Date-2025-10-30_Time-09-37-38.971252.log If there is an issue, please report to github with logFile! Getting UMAP Embedding ColorBy = cellColData Plotting Embedding 1 ArchR logging successful to : ArchRLogs/ArchR-plotEmbedding-91a8465bb1fd-Date-2025-10-30_Time-09-37-38.971252.log ArchR logging to : ArchRLogs/ArchR-plotEmbedding-91a8586fdecc-Date-2025-10-30_Time-09-37-45.172277.log If there is an issue, please report to github with logFile! Getting UMAP Embedding ColorBy = cellColData Plotting Embedding 1 ArchR logging successful to : ArchRLogs/ArchR-plotEmbedding-91a8586fdecc-Date-2025-10-30_Time-09-37-45.172277.log Plotting Ggplot! Plotting Ggplot!
Image output
markdown cell

7. Gene Scores and Marker Genes

code cell
In [19]:
markersGS <- getMarkerFeatures( ArchRProj = projHeme2, useMatrix = "GeneScoreMatrix", groupBy = "Clusters", bias = c("TSSEnrichment", "log10(nFrags)"), testMethod = "wilcoxon" )
Output:
ArchR logging to : ArchRLogs/ArchR-getMarkerFeatures-91a831f7580f-Date-2025-10-30_Time-10-01-28.703741.log If there is an issue, please report to github with logFile! MatrixClass = Sparse.Double.Matrix 2025-10-30 10:01:34.688203 : Matching Known Biases, 0.007 mins elapsed. ########### 2025-10-30 10:02:20.264148 : Completed Pairwise Tests, 0.767 mins elapsed. ########### ArchR logging successful to : ArchRLogs/ArchR-getMarkerFeatures-91a831f7580f-Date-2025-10-30_Time-10-01-28.703741.log
code cell
In [18]:
projHeme2
Output:
___ .______ ______ __ __ .______ / \ | _ \ / || | | | | _ \ / ^ \ | |_) | | ,----'| |__| | | |_) | / /_\ \ | / | | | __ | | / / _____ \ | |\ \\___ | `----.| | | | | |\ \\___. /__/ \__\ | _| `._____| \______||__| |__| | _| `._____|
Result:
class: ArchRProject outputDirectory: /share2/pub/zhouyj/zhouyj/Human_eye_atlas/analysis/demo/Save-ProjHeme2 samples(2): fetal_eye_except_retina fetal_retina sampleColData names(1): ArrowFiles cellColData names(16): Sample TSSEnrichment ... BlacklistRatio Clusters numberOfCells(1): 6185 medianTSS(1): 17.885 medianFrags(1): 9945
code cell
In [20]:
markerList <- getMarkers(markersGS, cutOff = "FDR <= 0.01 & Log2FC >= 1.25") markerList$C1
Result:
DataFrame with 333 rows and 9 columns seqnames start end strand name idx Log2FC <Rle> <integer> <integer> <integer> <character> <integer> <numeric> 15911 chr22 37988794 37988853 1 MIR4534 340 3.08574 15909 chr22 37952607 38041106 1 POLR2F 338 2.39788 14341 chr2 222298996 222199888 2 PAX3 1392 2.71089 15910 chr22 37987422 37970686 2 SOX10 339 3.34180 11420 chr19 3539154 3544030 1 C19orf71 124 3.27196 ... ... ... ... ... ... ... ... 648 chr1 40317816 40300487 2 COL9A2 648 1.40391 17414 chr3 186635969 186653141 1 FETUB 1282 1.44268 9830 chr17 17813545 17813480 2 MIR6777 344 1.29721 24613 chrX 119565408 119538149 2 CXorf56 650 1.49323 2364 chr1 241132272 241132346 1 MIR3123 2364 2.71766 FDR MeanDiff <numeric> <numeric> 15911 7.92171e-48 2.560543 15909 6.27628e-47 1.762458 14341 6.87379e-44 1.080938 15910 2.83076e-34 0.922795 11420 9.84625e-34 1.537612 ... ... ... 648 0.00901485 0.313458 17414 0.00923812 0.073252 9830 0.00925712 0.353038 24613 0.00969600 0.203829 2364 0.00975850 0.740864
code cell
In [21]:
markerGenes <- gene_markers <- c( "CAND1","HES1","SOX2", ##RPCs "ATOH7","HES6","DLL3", ##NPCs "RGR","RLBP1","GPX3", ##MGCs "NRL","RHO","GNGT1", ##Rods "ONECUT2","ONECUT1", ##HCs "VSX1","TMEM215", ##BCs "GAD2","TFAP2A", ##ACs "GAP43","SNCG","NEFM", ##RGCs "DCN","MGP", ##Fibroblasts "PMEL","TYRP1", ##Melanocytes & RPE "RPE65","TTR", ##RPE "PDE6H","THRB","PRDM1", ##Cones "CRYGC","LIM2", ##Lens cells "C1QA","C1QB", ##Microphages "TMEM119","CX3CR1", ##Microglia "KRT5", ##Corneal epithelium "PECAM1","ENG" ## Endothelium ) heatmapGS <- markerHeatmap( seMarker = markersGS, cutOff = "FDR <= 0.01 & Log2FC >= 1.25", labelMarkers = markerGenes, transpose = TRUE )
Output:
Warning message: “'markerHeatmap' is deprecated. Use 'plotMarkerHeatmap' instead. See help("Deprecated")” ArchR logging to : ArchRLogs/ArchR-plotMarkerHeatmap-91a8425062b4-Date-2025-10-30_Time-10-07-58.306658.log If there is an issue, please report to github with logFile! Printing Top Marker Genes: C1: SSU72, TNFRSF14, RBP7, NPPA-AS1, AKR7L, MIR4425, TRIM63, TRNP1, PABPC4-AS1, COL9A2, RNU5F-1, MIR4711, LINC00466, IL12RB2, PROK1 C2: EPHA10, FAM78B, CAMK1D, PLEKHA7, LOC101928943, P3H3, MYO16, MEIS2, MIR8063, GSE1, TMEM132E, SLC1A6, SULT2A1, THRB, PROM1 C3: LINC01654, LOC101927342, OVGP1, LINC01750, C1orf68, LCE1F, IVL, SPRR4, FCRL4, FCRL2, CCDC181, LINC01699, KCNK2, SUSD4, OR2C3 C4: AKR7A3, LACTBL1, CNKSR1, DNAJC6, KIAA1107, LINC01356, FCRL5, KIF21B, MIR135B, USH2A, GPATCH2, SPHAR, LOC101927787, PRKCQ-AS1, C1QL3 C5: DISP3, ZDHHC18, MPL, MED8, KLF17, LINC01398, DMBX1, TTLL7, WDR47, SLC6A17, MAEL, LOC102724919, PDC, PKP1, KCNMA1-AS3 C6: CSF3R, NTNG1, CELF3, RIIAD1, CADM3, FCER1A, LHX4, RGS8, RGS18, DRD4, CEND1, SLC6A5, LINC00678, BDNF, SLC1A2 C7: LOC102724659, TIE1, BTBD19, CYP4B1, TAL1, S1PR1, LOC101928370, MIR7852, MOV10, S100A16, MIR8083, TPM3, SHE, TDRD10, DCST1 C8: TMEM132C, FAM87B, MIR6726, MIR6727, SSU72, PLCH2, TNFRSF14, RBP7, CORT, CASZ1, DISP3, LOC102724659, NPPA-AS1, LINC01654, AKR7L C9: CLMAT3, FAM87B, MIR6726, MIR6727, SSU72, PLCH2, TNFRSF14, RBP7, CORT, CASZ1, DISP3, LOC102724659, NPPA-AS1, LINC01654, AKR7L C10: MIR942, FAM99B, NAV2-AS5, ALKBH3-AS1, CCDC89, C1R, CSNK1A1L, MIR548X2, MIR337, MIR1276, FOXL1, MFAP4, ABCA8, ELANE, COL5A3 C11: FAM87B, MIR6727, CORT, FAM43B, COL8A2, NGF, S100A14, MIR214, PITRM1-AS1, GPRC5D-AS1, GLIPR1, SERTM1, TGM1, MIR342, USP50 C12: MIR6127, THEMIS2, DENND2D, CHI3L2, C1orf162, TMIGD3, CD1C, CD1E, CD48, ATP6V1G3, PTPRC, NLRP3, OR2T1, OR2T2, MRC1 C13: FAM181A, LINC00511, LINC00297, MIR4534, SLC35F1, FAM87B, MIR6726, MIR6727, SSU72, PLCH2, TNFRSF14, RBP7, CORT, CASZ1, DISP3 C14: PIFO, LMX1A, CACNA1E, LINC01031, SLIT1-AS1, LOC101927549, LMO1, CALCB, CALCA, PAX6, PAX6-AS1, PAUPAR, OR5A2, LGALS12, NTM-IT C15: BARHL2, LINC02609, IGSF9B, B3GAT1, LOC283177, SLC6A12, KRT77, DTX1, ONECUT3, CHST8, MYT1L-AS1, SLC32A1, LOC100129027, MAB21L2, LINC01287 Identified 1719 markers!
Output:
[1] "CAND1" "HES1" "SOX2" "ATOH7" "HES6" "DLL3" "RGR" [8] "RLBP1" "GPX3" "NRL" "RHO" "GNGT1" "ONECUT2" "ONECUT1" [15] "VSX1" "TMEM215" "GAD2" "TFAP2A" "GAP43" "SNCG" "NEFM" [22] "DCN" "MGP" "PMEL" "TYRP1" "RPE65" "TTR" "PDE6H" [29] "THRB" "PRDM1" "CRYGC" "LIM2" "C1QA" "C1QB" "TMEM119" [36] "CX3CR1" "KRT5" "PECAM1" "ENG"
Output:
Adding Annotations.. Preparing Main Heatmap.. ArchR logging successful to : ArchRLogs/ArchR-plotMarkerHeatmap-91a8425062b4-Date-2025-10-30_Time-10-07-58.306658.log
code cell
In [22]:
##Set order by @row_order #Exp: heatmapGS@row_order <- c(10,11,12,3,4,5,1,2,6,8,7,9) ComplexHeatmap::draw(heatmapGS, heatmap_legend_side = "bot", annotation_legend_side = "bot")
Image output
code cell
In [23]:
markerGenes <- c( "CCND1", ##RPCs "ATOH7", ##NPCs "GAP43", ##RGCs "ONECUT1", ##HCs "TFAP2A", ##ACs "PRDM1", ##Cones "NRL" ##Rods ) p8 <- plotEmbedding( ArchRProj = projHeme2, colorBy = "GeneScoreMatrix", name = markerGenes, embedding = "UMAPHarmony", quantCut = c(0.01, 0.95), imputeWeights = NULL )
Output:
ArchR logging to : ArchRLogs/ArchR-plotEmbedding-91a85b0b3c9b-Date-2025-10-30_Time-10-14-59.339542.log If there is an issue, please report to github with logFile! Getting UMAP Embedding ColorBy = GeneScoreMatrix Getting Matrix Values... 2025-10-30 10:15:07.568725 : Plotting Embedding 1 2 3 4 5 6 7 ArchR logging successful to : ArchRLogs/ArchR-plotEmbedding-91a85b0b3c9b-Date-2025-10-30_Time-10-14-59.339542.log
code cell
In [24]:
p8$CCND1
Image output
code cell
In [25]:
p9 <- lapply(p8, function(x){ x + guides(color = FALSE, fill = FALSE) + theme_ArchR(baseSize = 6.5) + theme(plot.margin = unit(c(0, 0, 0, 0), "cm")) + theme( axis.text.x=element_blank(), axis.ticks.x=element_blank(), axis.text.y=element_blank(), axis.ticks.y=element_blank() ) }) do.call(cowplot::plot_grid, c(list(ncol = 3),p9))
Output:
Warning message: “The `<scale>` argument of `guides()` cannot be `FALSE`. Use "none" instead as of ggplot2 3.3.4.”
Image output
code cell
In [26]:
#Marker Genes Imputation with MAGIC projHeme2 <- addImputeWeights(projHeme2) markerGenes <- c( "CCND1", ##RPCs "ATOH7", ##NPCs "GAP43", ##RGCs "ONECUT1", ##HCs "TFAP2A", ##ACs "PRDM1", ##Cones "NRL" ##Rods ) p10 <- plotEmbedding( ArchRProj = projHeme2, colorBy = "GeneScoreMatrix", name = markerGenes, embedding = "UMAPHarmony", quantCut = c(0.01, 0.95), imputeWeights = getImputeWeights(projHeme2) )
Output:
ArchR logging to : ArchRLogs/ArchR-addImputeWeights-91a878d0a9f-Date-2025-10-30_Time-10-17-45.350946.log If there is an issue, please report to github with logFile! 2025-10-30 10:17:45.519319 : Computing Impute Weights Using Magic (Cell 2018), 0 mins elapsed. Getting ImputeWeights ArchR logging to : ArchRLogs/ArchR-plotEmbedding-91a87fa0d8c7-Date-2025-10-30_Time-10-17-52.494176.log If there is an issue, please report to github with logFile! Getting UMAP Embedding ColorBy = GeneScoreMatrix Getting Matrix Values... 2025-10-30 10:17:52.894195 : Imputing Matrix Using weights on disk 1 of 1 Using weights on disk 1 of 1 Plotting Embedding 1 2 3 4 5 6 7 ArchR logging successful to : ArchRLogs/ArchR-plotEmbedding-91a87fa0d8c7-Date-2025-10-30_Time-10-17-52.494176.log
code cell
In [27]:
p10$CCND1
Image output
code cell
In [28]:
p11 <- lapply(p10, function(x){ x + guides(color = FALSE, fill = FALSE) + theme_ArchR(baseSize = 6.5) + theme(plot.margin = unit(c(0, 0, 0, 0), "cm")) + theme( axis.text.x=element_blank(), axis.ticks.x=element_blank(), axis.text.y=element_blank(), axis.ticks.y=element_blank() ) }) do.call(cowplot::plot_grid, c(list(ncol = 3),p11))
Image output
code cell
In [42]:
#Module Scores #Each group needs atleast 2 genes #features <- list( # RPCsScore = c("CAND1","HES1","SOX2"), # NPCsScore = c("ATOH7","HES6","DLL3"), # MGCsScore = c("RGR","RLBP1","GPX3"), # RodsScore = c("NRL","RHO","GNGT1"), # HCscore = c("ONECUT2","ONECUT1"), # BCscore = c("VSX1","TMEM215"), # ACscore = c("GAD2","TFAP2A"), # RGCsScore = c("GAP43","SNCG","NEFM"), # FibroblastsScore = c("DCN","MGP"), # MelanocytesRPEScore = c("PMEL","TYRP1"), # RPEScore = c("RPE65","TTR"), # ConesScore = c("PDE6H","THRB","PRDM1"), # LensCellsScore = c("CRYGC","LIM2"), # MicrophagesScore = c("C1QA","C1QB"), # MicrogliaScore = c("TMEM119","CX3CR1"), # EndotheliumScore = c("PECAM1","ENG") #) features <- list( RPCsScore = c("CAND1","HES1","SOX2") ) projHeme2 <- addModuleScore(projHeme2, useMatrix = "GeneScoreMatrix", name = "Module", features = features)
Output:
ArchR logging to : ArchRLogs/ArchR-addModuleScore-91a81df8a4af-Date-2025-10-30_Time-10-44-41.57271.log If there is an issue, please report to github with logFile! 2025-10-30 10:44:43.343194 : Computing Module 1 of 1, 0.026 mins elapsed. Overriding previous entry for Module.RPCsScore 2025-10-30 10:44:51.719661 : Finished Running addModuleScore, 0.166 mins elapsed.
code cell
In [43]:
p12 <- plotEmbedding(projHeme2, embedding = "UMAPHarmony", colorBy = "cellColData", name="Module.RPCsScore", imputeWeights = getImputeWeights(projHeme2))
Output:
Getting ImputeWeights ArchR logging to : ArchRLogs/ArchR-plotEmbedding-91a82e3b21fc-Date-2025-10-30_Time-10-44-51.745916.log If there is an issue, please report to github with logFile! Getting UMAP Embedding ColorBy = cellColData Imputing Matrix Using weights on disk 1 of 1 Using weights on disk 1 of 1 Plotting Embedding 1 ArchR logging successful to : ArchRLogs/ArchR-plotEmbedding-91a82e3b21fc-Date-2025-10-30_Time-10-44-51.745916.log
code cell
In [44]:
p12
Image output
code cell
In [39]:
#Track Plotting set.seed(1234) markerGenes <- c( "CCND1", ##RPCs "ATOH7", ##NPCs "GAP43", ##RGCs "ONECUT1", ##HCs "TFAP2A", ##ACs "PRDM1", ##Cones "NRL" ##Rods ) p <- plotBrowserTrack( ArchRProj = projHeme2, groupBy = "Clusters", geneSymbol = markerGenes, upstream = 50000, downstream = 50000 )
Output:
ArchR logging to : ArchRLogs/ArchR-plotBrowserTrack-91a8e0dc389-Date-2025-10-30_Time-10-41-27.542857.log If there is an issue, please report to github with logFile! 2025-10-30 10:41:27.715202 : Validating Region, 0.003 mins elapsed.
Output:
GRanges object with 7 ranges and 2 metadata columns: seqnames ranges strand | gene_id symbol <Rle> <IRanges> <Rle> | <character> <character> [1] chr11 69641087-69654474 + | 595 CCND1 [2] chr10 68230624-68232103 - | 220202 ATOH7 [3] chr3 115623324-115721490 + | 2596 GAP43 [4] chr15 52756989-52791078 - | 3175 ONECUT1 [5] chr6 10393186-10419659 - | 7020 TFAP2A [6] chr6 106086320-106109939 + | 639 PRDM1 [7] chr14 24080107-24115014 - | 4901 NRL ------- seqinfo: 24 sequences from hg38 genome
Output:
2025-10-30 10:41:27.844366 : Adding Bulk Tracks (1 of 7), 0.005 mins elapsed. 2025-10-30 10:41:28.942532 : Adding Gene Tracks (1 of 7), 0.023 mins elapsed. 2025-10-30 10:41:29.423961 : Plotting, 0.031 mins elapsed. 2025-10-30 10:41:31.003763 : Adding Bulk Tracks (2 of 7), 0.058 mins elapsed. 2025-10-30 10:41:31.868226 : Adding Gene Tracks (2 of 7), 0.072 mins elapsed. 2025-10-30 10:41:32.125812 : Plotting, 0.076 mins elapsed. 2025-10-30 10:41:33.399572 : Adding Bulk Tracks (3 of 7), 0.098 mins elapsed. 2025-10-30 10:41:34.412379 : Adding Gene Tracks (3 of 7), 0.115 mins elapsed. 2025-10-30 10:41:34.660011 : Plotting, 0.119 mins elapsed. 2025-10-30 10:41:35.881685 : Adding Bulk Tracks (4 of 7), 0.139 mins elapsed. 2025-10-30 10:41:36.659171 : Adding Gene Tracks (4 of 7), 0.152 mins elapsed. 2025-10-30 10:41:36.9099 : Plotting, 0.156 mins elapsed. 2025-10-30 10:41:38.170023 : Adding Bulk Tracks (5 of 7), 0.177 mins elapsed. 2025-10-30 10:41:39.095898 : Adding Gene Tracks (5 of 7), 0.193 mins elapsed. 2025-10-30 10:41:39.363946 : Plotting, 0.197 mins elapsed. 2025-10-30 10:41:40.773479 : Adding Bulk Tracks (6 of 7), 0.221 mins elapsed. 2025-10-30 10:41:41.702989 : Adding Gene Tracks (6 of 7), 0.236 mins elapsed. 2025-10-30 10:41:41.961645 : Plotting, 0.24 mins elapsed. 2025-10-30 10:41:43.310858 : Adding Bulk Tracks (7 of 7), 0.263 mins elapsed. 2025-10-30 10:41:44.239631 : Adding Gene Tracks (7 of 7), 0.278 mins elapsed. 2025-10-30 10:41:44.504083 : Plotting, 0.283 mins elapsed. ArchR logging successful to : ArchRLogs/ArchR-plotBrowserTrack-91a8e0dc389-Date-2025-10-30_Time-10-41-27.542857.log
code cell
In [40]:
grid::grid.newpage() grid::grid.draw(p$CCND1) plotPDF(plotList = p, name = "Plot-Tracks-Marker-Genes.pdf", ArchRProj = projHeme2, addDOC = FALSE, width = 5, height = 5)
Output:
Plotting Gtable!
Output:
NULL
Output:
Plotting Gtable!
Output:
NULL
Output:
Plotting Gtable!
Output:
NULL
Output:
Plotting Gtable!
Output:
NULL
Output:
Plotting Gtable!
Output:
NULL
Output:
Plotting Gtable!
Output:
NULL
Output:
Plotting Gtable!
Output:
NULL
Image output
markdown cell

8. Defining Cluster Identity with scRNA-seq

code cell
In [46]:
#Cross-platform linkage of scATAC-seq cells with scRNA-seq cells #sc/snRNA-seq data needs Seurat v4 or SingleCellExperiment seRNA <- readRDS("/share2/pub/zhouyj/zhouyj/Human_eye_atlas/analysis/demo/data/hro_rna_final_anno.rds") projHeme2 <- addGeneIntegrationMatrix( ArchRProj = projHeme2, useMatrix = "GeneScoreMatrix", matrixName = "GeneIntegrationMatrix", reducedDims = "IterativeLSI", seRNA = seRNA, addToArrow = FALSE, groupRNA = "Cell_type", nameCell = "predictedCell_Un", nameGroup = "predictedGroup_Un", nameScore = "predictedScore_Un" )
Output:
ArchR logging to : ArchRLogs/ArchR-addGeneIntegrationMatrix-91a81b9703b5-Date-2025-10-31_Time-17-16-14.50102.log If there is an issue, please report to github with logFile! 2025-10-31 17:16:14.650241 : Running Seurat's Integration Stuart* et al 2019, 0.002 mins elapsed. 2025-10-31 17:16:14.676977 : Checking ATAC Input, 0.003 mins elapsed. 2025-10-31 17:16:14.895387 : Checking RNA Input, 0.007 mins elapsed. 2025-10-31 17:16:20.153665 : Found 21369 overlapping gene names from gene scores and rna matrix!, 0.094 mins elapsed. 2025-10-31 17:16:20.156279 : Creating Integration Blocks, 0.094 mins elapsed. 2025-10-31 17:16:20.453998 : Prepping Interation Data, 0.099 mins elapsed. subThreading Disabled since ArchRLocking is TRUE see `addArchRLocking` 2025-10-31 17:16:20.998352 : Computing Integration in 1 Integration Blocks!, 0 mins elapsed. 2025-10-31 17:16:21.001557 : Block (1 of 1) : Computing Integration, 0 mins elapsed. 2025-10-31 17:16:24.872289 : Block (1 of 1) : Identifying Variable Genes, 0.065 mins elapsed. 2025-10-31 17:16:27.485666 : Block (1 of 1) : Getting GeneScoreMatrix, 0.108 mins elapsed. 2025-10-31 17:16:37.594323 : Block (1 of 1) : Imputing GeneScoreMatrix, 0.277 mins elapsed. Getting ImputeWeights 2025-10-31 17:17:17.107258 : Block (1 of 1) : Seurat FindTransferAnchors, 0.935 mins elapsed. 2025-10-31 17:17:34.653529 : Block (1 of 1) : Seurat TransferData Cell Group Labels, 1.228 mins elapsed. 2025-10-31 17:17:36.469198 : Block (1 of 1) : Seurat TransferData Cell Names Labels, 1.258 mins elapsed. 2025-10-31 17:17:49.629022 : Block (1 of 1) : Saving TransferAnchors Joint CCA, 1.477 mins elapsed. 2025-10-31 17:17:51.429326 : Block (1 of 1) : Completed Integration, 1.507 mins elapsed. 2025-10-31 17:17:52.821755 : Block (1 of 1) : Plotting Joint UMAP, 1.53 mins elapsed. Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics' Also defined by ‘spam’ Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics' Also defined by ‘spam’ 2025-10-31 17:18:30.471151 : Completed Integration with RNA Matrix, 2.158 mins elapsed. ArchR logging successful to : ArchRLogs/ArchR-addGeneIntegrationMatrix-91a81b9703b5-Date-2025-10-31_Time-17-16-14.50102.log
code cell
In [61]:
cM <- as.matrix(confusionMatrix(projHeme2$Clusters, projHeme2$predictedGroup_Un)) preClust <- colnames(cM)[apply(cM, 1 , which.max)] cblist <- cbind(preClust, rownames(cM)) %>% as.data.frame() #Assignments colnames(cblist) <- c('RNA_anno','ATAC_clusters') cblist
A data.frame: 15 × 2
RNA_annoATAC_clusters
<chr><chr>
BCs C3
Corneal epitheliumC1
ACs C12
Rods C5
Fibroblasts C10
Fibroblasts C8
Fibroblasts C11
Rods C4
BCs C6
Microglia C7
RPE C14
Fibroblasts C13
ACs C15
Cones C2
RPCs C9
code cell
In [68]:
celltype_groups <- list( "Photoreceptors" = c("Cones", "Rods", "BCs"), # 光感受器及相关 "Retinal_Neurons" = c("HCs", "ACs", "RGCs"), # 其他视网膜神经元 "Glial_Cells" = c("MGCs", "Microglia", "Microphages"), # 胶质细胞 "Epithelial_Cells" = c("RPE", "Corneal epithelium", "Lens cells"), # 上皮细胞 "Progenitor_Cells" = c("RPCs", "NPCs"), # 前体/祖细胞 "Stromal_Cells" = c("Fibroblasts") # 间质细胞 ) clustPhotoreceptors <- cblist %>% subset(RNA_anno %in% celltype_groups$Photoreceptors) %>% pull(ATAC_clusters) clustRetinal_Neurons <- cblist %>% subset(RNA_anno %in% celltype_groups$Retinal_Neurons) %>% pull(ATAC_clusters) clustGlial_Cells <- cblist %>% subset(RNA_anno %in% celltype_groups$Glial_Cells) %>% pull(ATAC_clusters) clustEpithelial_Cells <- cblist %>% subset(RNA_anno %in% celltype_groups$Epithelial_Cells) %>% pull(ATAC_clusters) clustProgenitor_Cells <- cblist %>% subset(RNA_anno %in% celltype_groups$Progenitor_Cells) %>% pull(ATAC_clusters) clustStromal_Cells <- cblist %>% subset(RNA_anno %in% celltype_groups$Stromal_Cells) %>% pull(ATAC_clusters) rnaPhotoreceptors <- seRNA %>% subset(Cell_type %in% celltype_groups$Photoreceptors) %>% colnames() rnaRetinal_Neurons <- seRNA %>% subset(Cell_type %in% celltype_groups$Retinal_Neurons) %>% colnames() rnaGlial_Cells <- seRNA %>% subset(Cell_type %in% celltype_groups$Glial_Cells) %>% colnames() rnaEpithelial_Cells <- seRNA %>% subset(Cell_type %in% celltype_groups$Epithelial_Cells) %>% colnames() rnaProgenitor_Cells <- seRNA %>% subset(Cell_type %in% celltype_groups$Progenitor_Cells) %>% colnames() rnaStromal_Cells <- seRNA %>% subset(Cell_type %in% celltype_groups$Stromal_Cells) %>% colnames() groupList <- SimpleList( Photoreceptors = SimpleList( ATAC = projHeme2$cellNames[projHeme2$Clusters %in% clustPhotoreceptors], RNA = rnaPhotoreceptors ), Retinal_Neurons = SimpleList( ATAC = projHeme2$cellNames[projHeme2$Clusters %in% clustRetinal_Neurons], RNA = rnaRetinal_Neurons ) , Glial_Cells = SimpleList( ATAC = projHeme2$cellNames[projHeme2$Clusters %in% clustGlial_Cells], RNA = rnaGlial_Cells ) , Epithelial_Cells = SimpleList( ATAC = projHeme2$cellNames[projHeme2$Clusters %in% clustEpithelial_Cells], RNA = rnaEpithelial_Cells ) , Progenitor_Cells = SimpleList( ATAC = projHeme2$cellNames[projHeme2$Clusters %in% clustProgenitor_Cells], RNA = rnaProgenitor_Cells ) , Stromal_Cells = SimpleList( ATAC = projHeme2$cellNames[projHeme2$Clusters %in% clustStromal_Cells], RNA = rnaStromal_Cells ) )
code cell
In [69]:
#Adding Pseudo-scRNA-seq profiles for each scATAC-seq cell projHeme3 <- addGeneIntegrationMatrix( ArchRProj = projHeme2, useMatrix = "GeneScoreMatrix", matrixName = "GeneIntegrationMatrix", reducedDims = "IterativeLSI", seRNA = seRNA, addToArrow = TRUE, force= TRUE, groupList = groupList, groupRNA = "Cell_type", nameCell = "predictedCell", nameGroup = "predictedGroup", nameScore = "predictedScore" )
Output:
ArchR logging to : ArchRLogs/ArchR-addGeneIntegrationMatrix-91a864e5a7da-Date-2025-10-31_Time-17-52-05.274252.log If there is an issue, please report to github with logFile! 2025-10-31 17:52:05.438426 : Running Seurat's Integration Stuart* et al 2019, 0.003 mins elapsed. 2025-10-31 17:52:05.460565 : Checking ATAC Input, 0.003 mins elapsed. 2025-10-31 17:52:05.719466 : Checking RNA Input, 0.007 mins elapsed. 2025-10-31 17:52:10.959871 : Found 21369 overlapping gene names from gene scores and rna matrix!, 0.095 mins elapsed. 2025-10-31 17:52:10.962484 : Creating Integration Blocks, 0.095 mins elapsed. 2025-10-31 17:52:11.163365 : Prepping Interation Data, 0.098 mins elapsed. subThreading Disabled since ArchRLocking is TRUE see `addArchRLocking` 2025-10-31 17:52:11.699249 : Computing Integration in 6 Integration Blocks!, 0 mins elapsed. 2025-10-31 17:52:11.706991 : Block (1 of 6) : Computing Integration, 0 mins elapsed. 2025-10-31 17:52:15.239796 : Block (1 of 6) : Identifying Variable Genes, 0.059 mins elapsed. 2025-10-31 17:52:17.644931 : Block (1 of 6) : Getting GeneScoreMatrix, 0.099 mins elapsed. 2025-10-31 17:52:23.444779 : Block (1 of 6) : Imputing GeneScoreMatrix, 0.196 mins elapsed. Getting ImputeWeights 2025-10-31 17:52:25.743501 : Block (1 of 6) : Seurat FindTransferAnchors, 0.234 mins elapsed. 2025-10-31 17:52:36.354176 : Block (1 of 6) : Seurat TransferData Cell Group Labels, 0.411 mins elapsed. 2025-10-31 17:52:36.791763 : Block (1 of 6) : Seurat TransferData Cell Names Labels, 0.418 mins elapsed. 2025-10-31 17:52:39.999354 : Block (1 of 6) : Seurat TransferData GeneMatrix, 0.472 mins elapsed. 2025-10-31 17:52:42.303766 : Block (1 of 6) : Saving TransferAnchors Joint CCA, 0.51 mins elapsed. 2025-10-31 17:52:43.409859 : Block (1 of 6) : Transferring Paired RNA to Temp File, 0.529 mins elapsed. 2025-10-31 17:52:44.873069 : Block (1 of 6) : Completed Integration, 0.553 mins elapsed. 2025-10-31 17:52:45.872324 : Block (2 of 6) : Computing Integration, 0.57 mins elapsed. 2025-10-31 17:52:49.095483 : Block (2 of 6) : Identifying Variable Genes, 0.623 mins elapsed. 2025-10-31 17:52:51.537902 : Block (2 of 6) : Getting GeneScoreMatrix, 0.664 mins elapsed. 2025-10-31 17:52:56.345476 : Block (2 of 6) : Imputing GeneScoreMatrix, 0.744 mins elapsed. Getting ImputeWeights 2025-10-31 17:52:57.888306 : Block (2 of 6) : Seurat FindTransferAnchors, 0.77 mins elapsed. 2025-10-31 17:53:07.39262 : Block (2 of 6) : Seurat TransferData Cell Group Labels, 0.928 mins elapsed. 2025-10-31 17:53:07.667436 : Block (2 of 6) : Seurat TransferData Cell Names Labels, 0.933 mins elapsed. 2025-10-31 17:53:09.737132 : Block (2 of 6) : Seurat TransferData GeneMatrix, 0.967 mins elapsed. 2025-10-31 17:53:11.580403 : Block (2 of 6) : Saving TransferAnchors Joint CCA, 0.998 mins elapsed. 2025-10-31 17:53:12.67559 : Block (2 of 6) : Transferring Paired RNA to Temp File, 1.016 mins elapsed. 2025-10-31 17:53:13.669618 : Block (2 of 6) : Completed Integration, 1.033 mins elapsed. 2025-10-31 17:53:14.562255 : Block (3 of 6) : Computing Integration, 1.048 mins elapsed. 2025-10-31 17:53:16.014731 : Block (3 of 6) : Identifying Variable Genes, 1.072 mins elapsed. 2025-10-31 17:53:16.996151 : Block (3 of 6) : Getting GeneScoreMatrix, 1.088 mins elapsed. 2025-10-31 17:53:21.556036 : Block (3 of 6) : Imputing GeneScoreMatrix, 1.164 mins elapsed. Getting ImputeWeights 2025-10-31 17:53:22.715645 : Block (3 of 6) : Seurat FindTransferAnchors, 1.184 mins elapsed. 2025-10-31 17:53:25.507706 : Block (3 of 6) : Seurat TransferData Cell Group Labels, 1.23 mins elapsed. 2025-10-31 17:53:25.609681 : Block (3 of 6) : Seurat TransferData Cell Names Labels, 1.232 mins elapsed. 2025-10-31 17:53:25.739693 : Block (3 of 6) : Seurat TransferData GeneMatrix, 1.234 mins elapsed. 2025-10-31 17:53:26.971188 : Block (3 of 6) : Saving TransferAnchors Joint CCA, 1.255 mins elapsed. 2025-10-31 17:53:27.904337 : Block (3 of 6) : Transferring Paired RNA to Temp File, 1.27 mins elapsed. 2025-10-31 17:53:28.311682 : Block (3 of 6) : Completed Integration, 1.277 mins elapsed. 2025-10-31 17:53:29.211357 : Block (4 of 6) : Computing Integration, 1.292 mins elapsed. 2025-10-31 17:53:30.948436 : Block (4 of 6) : Identifying Variable Genes, 1.321 mins elapsed. 2025-10-31 17:53:32.144675 : Block (4 of 6) : Getting GeneScoreMatrix, 1.341 mins elapsed. 2025-10-31 17:53:37.116117 : Block (4 of 6) : Imputing GeneScoreMatrix, 1.424 mins elapsed. Getting ImputeWeights 2025-10-31 17:53:38.510089 : Block (4 of 6) : Seurat FindTransferAnchors, 1.447 mins elapsed. 2025-10-31 17:53:43.357619 : Block (4 of 6) : Seurat TransferData Cell Group Labels, 1.528 mins elapsed. 2025-10-31 17:53:43.548175 : Block (4 of 6) : Seurat TransferData Cell Names Labels, 1.531 mins elapsed. 2025-10-31 17:53:43.915248 : Block (4 of 6) : Seurat TransferData GeneMatrix, 1.537 mins elapsed. 2025-10-31 17:53:45.41506 : Block (4 of 6) : Saving TransferAnchors Joint CCA, 1.562 mins elapsed. 2025-10-31 17:53:46.37791 : Block (4 of 6) : Transferring Paired RNA to Temp File, 1.578 mins elapsed. 2025-10-31 17:53:47.148634 : Block (4 of 6) : Completed Integration, 1.591 mins elapsed. 2025-10-31 17:53:48.042885 : Block (5 of 6) : Computing Integration, 1.606 mins elapsed. 2025-10-31 17:53:51.579289 : Block (5 of 6) : Identifying Variable Genes, 1.665 mins elapsed. 2025-10-31 17:53:54.128001 : Block (5 of 6) : Getting GeneScoreMatrix, 1.707 mins elapsed. 2025-10-31 17:53:58.865273 : Block (5 of 6) : Imputing GeneScoreMatrix, 1.786 mins elapsed. Getting ImputeWeights 2025-10-31 17:53:59.941153 : Block (5 of 6) : Seurat FindTransferAnchors, 1.804 mins elapsed. 2025-10-31 17:54:08.924143 : Block (5 of 6) : Seurat TransferData Cell Group Labels, 1.954 mins elapsed. 2025-10-31 17:54:08.996598 : Block (5 of 6) : Seurat TransferData Cell Names Labels, 1.955 mins elapsed. 2025-10-31 17:54:10.218672 : Block (5 of 6) : Seurat TransferData GeneMatrix, 1.975 mins elapsed. 2025-10-31 17:54:11.452084 : Block (5 of 6) : Saving TransferAnchors Joint CCA, 1.996 mins elapsed. 2025-10-31 17:54:12.566777 : Block (5 of 6) : Transferring Paired RNA to Temp File, 2.014 mins elapsed. 2025-10-31 17:54:12.859383 : Block (5 of 6) : Completed Integration, 2.019 mins elapsed. 2025-10-31 17:54:13.758763 : Block (6 of 6) : Computing Integration, 2.034 mins elapsed. 2025-10-31 17:54:15.916436 : Block (6 of 6) : Identifying Variable Genes, 2.07 mins elapsed. 2025-10-31 17:54:17.411086 : Block (6 of 6) : Getting GeneScoreMatrix, 2.095 mins elapsed. 2025-10-31 17:54:24.692338 : Block (6 of 6) : Imputing GeneScoreMatrix, 2.217 mins elapsed. Getting ImputeWeights 2025-10-31 17:54:30.731245 : Block (6 of 6) : Seurat FindTransferAnchors, 2.317 mins elapsed. 2025-10-31 17:54:38.558136 : Block (6 of 6) : Seurat TransferData Cell Group Labels, 2.448 mins elapsed. 2025-10-31 17:54:39.653628 : Block (6 of 6) : Seurat TransferData Cell Names Labels, 2.466 mins elapsed. 2025-10-31 17:54:42.022363 : Block (6 of 6) : Seurat TransferData GeneMatrix, 2.505 mins elapsed. 2025-10-31 17:54:45.889776 : Block (6 of 6) : Saving TransferAnchors Joint CCA, 2.57 mins elapsed. 2025-10-31 17:54:46.986678 : Block (6 of 6) : Transferring Paired RNA to Temp File, 2.588 mins elapsed. 2025-10-31 17:54:50.423219 : Block (6 of 6) : Completed Integration, 2.645 mins elapsed. 2025-10-31 17:54:51.406242 : Block (1 of 6) : Plotting Joint UMAP, 2.662 mins elapsed. Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics' Also defined by ‘spam’ Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics' Also defined by ‘spam’ 2025-10-31 17:55:18.317399 : Block (2 of 6) : Plotting Joint UMAP, 3.11 mins elapsed. Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics' Also defined by ‘spam’ Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics' Also defined by ‘spam’ 2025-10-31 17:55:42.952962 : Block (3 of 6) : Plotting Joint UMAP, 3.521 mins elapsed. Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics' Also defined by ‘spam’ Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics' Also defined by ‘spam’ 2025-10-31 17:55:51.823908 : Block (4 of 6) : Plotting Joint UMAP, 3.669 mins elapsed. Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics' Also defined by ‘spam’ Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics' Also defined by ‘spam’ 2025-10-31 17:56:07.015005 : Block (5 of 6) : Plotting Joint UMAP, 3.922 mins elapsed. Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics' Also defined by ‘spam’ Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics' Also defined by ‘spam’ 2025-10-31 17:56:31.598351 : Block (6 of 6) : Plotting Joint UMAP, 4.332 mins elapsed. Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics' Also defined by ‘spam’ Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics' Also defined by ‘spam’ 2025-10-31 17:57:03.055419 : Transferring Data to ArrowFiles, 4.856 mins elapsed. 2025-10-31 17:57:03.0737 : fetal_eye_except_retina (1 of 2) Getting GeneIntegrationMatrix From TempFiles!, 4.856 mins elapsed. 2025-10-31 17:57:10.514187 : fetal_eye_except_retina (1 of 2) Adding GeneIntegrationMatrix to ArrowFile!, 4.98 mins elapsed. 2025-10-31 17:58:04.15675 : fetal_retina (2 of 2) Getting GeneIntegrationMatrix From TempFiles!, 5.874 mins elapsed. 2025-10-31 17:58:05.806291 : fetal_retina (2 of 2) Adding GeneIntegrationMatrix to ArrowFile!, 5.902 mins elapsed. 2025-10-31 17:58:39.959272 : Completed Integration with RNA Matrix, 6.471 mins elapsed. ArchR logging successful to : ArchRLogs/ArchR-addGeneIntegrationMatrix-91a864e5a7da-Date-2025-10-31_Time-17-52-05.274252.log
code cell
In [70]:
p13 <- plotEmbedding( ArchRProj = projHeme3, colorBy = "GeneIntegrationMatrix", name = markerGenes, continuousSet = "horizonExtra", embedding = "UMAP", imputeWeights = getImputeWeights(projHeme3) ) p14 <- plotEmbedding( ArchRProj = projHeme3, colorBy = "GeneScoreMatrix", continuousSet = "horizonExtra", name = markerGenes, embedding = "UMAP", imputeWeights = getImputeWeights(projHeme3) )
Output:
Getting ImputeWeights ArchR logging to : ArchRLogs/ArchR-plotEmbedding-91a8dd421cb-Date-2025-10-31_Time-17-58-40.089852.log If there is an issue, please report to github with logFile! Getting UMAP Embedding ColorBy = GeneIntegrationMatrix Getting Matrix Values... 2025-10-31 17:58:40.651754 : Imputing Matrix Using weights on disk 1 of 1 Using weights on disk 1 of 1 Plotting Embedding 1 2 3 4 5 6 7 ArchR logging successful to : ArchRLogs/ArchR-plotEmbedding-91a8dd421cb-Date-2025-10-31_Time-17-58-40.089852.log Getting ImputeWeights ArchR logging to : ArchRLogs/ArchR-plotEmbedding-91a847797697-Date-2025-10-31_Time-17-58-48.481978.log If there is an issue, please report to github with logFile! Getting UMAP Embedding ColorBy = GeneScoreMatrix Getting Matrix Values... 2025-10-31 17:58:49.094289 : Imputing Matrix Using weights on disk 1 of 1 Using weights on disk 1 of 1 Plotting Embedding 1 2 3 4 5 6 7 ArchR logging successful to : ArchRLogs/ArchR-plotEmbedding-91a847797697-Date-2025-10-31_Time-17-58-48.481978.log
code cell
In [71]:
#Labeling scATAC-seq clusters with scRNA-seq information cM <- confusionMatrix(projHeme3$Clusters, projHeme3$predictedGroup) labelOld <- rownames(cM) labelOld
  1. 'C3'
  2. 'C1'
  3. 'C12'
  4. 'C5'
  5. 'C10'
  6. 'C8'
  7. 'C11'
  8. 'C4'
  9. 'C6'
  10. 'C7'
  11. 'C14'
  12. 'C13'
  13. 'C15'
  14. 'C2'
  15. 'C9'
code cell
In [72]:
labelNew <- colnames(cM)[apply(cM, 1, which.max)] labelNew
  1. 'Rods'
  2. 'RPE'
  3. 'ACs'
  4. 'Rods'
  5. 'Fibroblasts'
  6. 'Fibroblasts'
  7. 'Fibroblasts'
  8. 'Rods'
  9. 'BCs'
  10. 'Microphages'
  11. 'RPE'
  12. 'Fibroblasts'
  13. 'ACs'
  14. 'Cones'
  15. 'RPCs'
code cell
In [73]:
projHeme3$Clusters2 <- mapLabels(projHeme3$Clusters, newLabels = labelNew, oldLabels = labelOld)
code cell
In [74]:
p15 <- plotEmbedding(projHeme3, colorBy = "cellColData", name = "Clusters2") p15
Output:
ArchR logging to : ArchRLogs/ArchR-plotEmbedding-91a848c0abc6-Date-2025-10-31_Time-17-59-46.285521.log If there is an issue, please report to github with logFile! Getting UMAP Embedding ColorBy = cellColData Plotting Embedding 1 ArchR logging successful to : ArchRLogs/ArchR-plotEmbedding-91a848c0abc6-Date-2025-10-31_Time-17-59-46.285521.log
Image output
code cell
In [75]:
projHeme3 <- saveArchRProject(ArchRProj = projHeme3, outputDirectory = "Save-ProjHeme3", load = TRUE)
Output:
Copying ArchRProject to new outputDirectory : /share2/pub/zhouyj/zhouyj/Human_eye_atlas/analysis/demo/Save-ProjHeme3 Copying Arrow Files... Copying Arrow Files (1 of 2) Copying Arrow Files (2 of 2) Getting ImputeWeights Dropping ImputeWeights... Copying Other Files... Copying Other Files (1 of 4): Embeddings Copying Other Files (2 of 4): IterativeLSI Copying Other Files (3 of 4): Plots Copying Other Files (4 of 4): RNAIntegration Saving ArchRProject... Loading ArchRProject... Successfully loaded ArchRProject! / | / \ . / |. \\\ / |. \\\ / `|. \\\ / |. \ / |\ \\#####\ / || ==###########> / || \\##==......\ / || ______ = =|__ /__ || \\\ ,--' ,----`-,__ ___/' --,-`-===================##========> \ ' ##_______ _____ ,--,__,=##,__ /// , __== ___,-,__,--'#' ===' `-' | ##,-/ -,____,---' \\####\\________________,--\\_##,/ ___ .______ ______ __ __ .______ / \ | _ \ / || | | | | _ \ / ^ \ | |_) | | ,----'| |__| | | |_) | / /_\ \ | / | | | __ | | / / _____ \ | |\ \\___ | `----.| | | | | |\ \\___. /__/ \__\ | _| `._____| \______||__| |__| | _| `._____|
code cell
rm(projHeme2) gc()
markdown cell

9. Making Pseudo-bulk Replicates

code cell
In [77]:
library(BSgenome.Hsapiens.UCSC.hg38) projHeme4 <- addGroupCoverages(ArchRProj = projHeme3, groupBy = "Clusters2")
Output:
Loading required package: BSgenome Loading required package: BiocIO Loading required package: rtracklayer Warning message: “package ‘rtracklayer’ was built under R version 4.3.3” Attaching package: ‘rtracklayer’ The following object is masked from ‘package:BiocIO’: FileForFormat ArchR logging to : ArchRLogs/ArchR-addGroupCoverages-91a856f6d087-Date-2025-10-31_Time-18-11-20.553077.log If there is an issue, please report to github with logFile! ACs (1 of 8) : CellGroups N = 2 BCs (2 of 8) : CellGroups N = 2 Cones (3 of 8) : CellGroups N = 2 Fibroblasts (4 of 8) : CellGroups N = 2 Microphages (5 of 8) : CellGroups N = 2 Rods (6 of 8) : CellGroups N = 2 RPCs (7 of 8) : CellGroups N = 2 RPE (8 of 8) : CellGroups N = 2 2025-10-31 18:11:24.733634 : Creating Coverage Files!, 0.07 mins elapsed. 2025-10-31 18:11:24.736996 : Batch Execution w/ safelapply!, 0.07 mins elapsed. 2025-10-31 18:11:24.741147 : Group ACs._.fetal_eye_except_retina (1 of 16) : Creating Group Coverage File : ACs._.fetal_eye_except_retina.insertions.coverage.h5, 0.07 mins elapsed. Number of Cells = 500 Coverage File Exists! Added Coverage Group Added Metadata Group Added ArrowCoverage Class Added Coverage/Info Added Coverage/Info/CellNames 2025-10-31 18:11:54.465145 : Group ACs._.fetal_retina (2 of 16) : Creating Group Coverage File : ACs._.fetal_retina.insertions.coverage.h5, 0.565 mins elapsed. Number of Cells = 179 Coverage File Exists! Added Coverage Group Added Metadata Group Added ArrowCoverage Class Added Coverage/Info Added Coverage/Info/CellNames 2025-10-31 18:12:28.908771 : Group BCs._.fetal_retina (3 of 16) : Creating Group Coverage File : BCs._.fetal_retina.insertions.coverage.h5, 1.139 mins elapsed. Number of Cells = 136 Coverage File Exists! Added Coverage Group Added Metadata Group Added ArrowCoverage Class Added Coverage/Info Added Coverage/Info/CellNames 2025-10-31 18:12:57.681518 : Group BCs._.fetal_eye_except_retina (4 of 16) : Creating Group Coverage File : BCs._.fetal_eye_except_retina.insertions.coverage.h5, 1.619 mins elapsed. Number of Cells = 117 Coverage File Exists! Added Coverage Group Added Metadata Group Added ArrowCoverage Class Added Coverage/Info Added Coverage/Info/CellNames 2025-10-31 18:13:25.117063 : Group Cones._.Rep1 (5 of 16) : Creating Group Coverage File : Cones._.Rep1.insertions.coverage.h5, 2.076 mins elapsed. Number of Cells = 40 Coverage File Exists! Added Coverage Group Added Metadata Group Added ArrowCoverage Class Added Coverage/Info Added Coverage/Info/CellNames 2025-10-31 18:13:55.259366 : Group Cones._.Rep2 (6 of 16) : Creating Group Coverage File : Cones._.Rep2.insertions.coverage.h5, 2.578 mins elapsed. Number of Cells = 40 Coverage File Exists! Added Coverage Group Added Metadata Group Added ArrowCoverage Class Added Coverage/Info Added Coverage/Info/CellNames 2025-10-31 18:14:25.564667 : Group Fibroblasts._.fetal_eye_except_retina (7 of 16) : Creating Group Coverage File : Fibroblasts._.fetal_eye_except_retina.insertions.coverage.h5, 3.084 mins elapsed. Number of Cells = 500 Coverage File Exists! Added Coverage Group Added Metadata Group Added ArrowCoverage Class Added Coverage/Info Added Coverage/Info/CellNames 2025-10-31 18:14:56.844514 : Group Fibroblasts._.fetal_retina (8 of 16) : Creating Group Coverage File : Fibroblasts._.fetal_retina.insertions.coverage.h5, 3.605 mins elapsed. Number of Cells = 70 Coverage File Exists! Added Coverage Group Added Metadata Group Added ArrowCoverage Class Added Coverage/Info Added Coverage/Info/CellNames 2025-10-31 18:15:23.53514 : Group Microphages._.Rep1 (9 of 16) : Creating Group Coverage File : Microphages._.Rep1.insertions.coverage.h5, 4.05 mins elapsed. Number of Cells = 266 Coverage File Exists! Added Coverage Group Added Metadata Group Added ArrowCoverage Class Added Coverage/Info Added Coverage/Info/CellNames 2025-10-31 18:15:51.290942 : Group Microphages._.Rep2 (10 of 16) : Creating Group Coverage File : Microphages._.Rep2.insertions.coverage.h5, 4.512 mins elapsed. Number of Cells = 40 Coverage File Exists! Added Coverage Group Added Metadata Group Added ArrowCoverage Class Added Coverage/Info Added Coverage/Info/CellNames 2025-10-31 18:16:20.576663 : Group Rods._.fetal_retina (11 of 16) : Creating Group Coverage File : Rods._.fetal_retina.insertions.coverage.h5, 5 mins elapsed. Number of Cells = 500 Coverage File Exists! Added Coverage Group Added Metadata Group Added ArrowCoverage Class Added Coverage/Info Added Coverage/Info/CellNames 2025-10-31 18:16:54.15746 : Group Rods._.fetal_eye_except_retina (12 of 16) : Creating Group Coverage File : Rods._.fetal_eye_except_retina.insertions.coverage.h5, 5.56 mins elapsed. Number of Cells = 435 Coverage File Exists! Added Coverage Group Added Metadata Group Added ArrowCoverage Class Added Coverage/Info Added Coverage/Info/CellNames 2025-10-31 18:17:24.060377 : Group RPCs._.Rep1 (13 of 16) : Creating Group Coverage File : RPCs._.Rep1.insertions.coverage.h5, 6.058 mins elapsed. Number of Cells = 134 Coverage File Exists! Added Coverage Group Added Metadata Group Added ArrowCoverage Class Added Coverage/Info Added Coverage/Info/CellNames 2025-10-31 18:17:50.501192 : Group RPCs._.Rep2 (14 of 16) : Creating Group Coverage File : RPCs._.Rep2.insertions.coverage.h5, 6.499 mins elapsed. Number of Cells = 40 Coverage File Exists! Added Coverage Group Added Metadata Group Added ArrowCoverage Class Added Coverage/Info Added Coverage/Info/CellNames 2025-10-31 18:18:19.122296 : Group RPE._.fetal_eye_except_retina (15 of 16) : Creating Group Coverage File : RPE._.fetal_eye_except_retina.insertions.coverage.h5, 6.976 mins elapsed. Number of Cells = 500 Coverage File Exists! Added Coverage Group Added Metadata Group Added ArrowCoverage Class Added Coverage/Info Added Coverage/Info/CellNames 2025-10-31 18:18:48.850879 : Group RPE._.fetal_retina (16 of 16) : Creating Group Coverage File : RPE._.fetal_retina.insertions.coverage.h5, 7.472 mins elapsed. Number of Cells = 69 Coverage File Exists! Added Coverage Group Added Metadata Group Added ArrowCoverage Class Added Coverage/Info Added Coverage/Info/CellNames 2025-10-31 18:19:14.971433 : Adding Kmer Bias to Coverage Files!, 7.907 mins elapsed.
Output:
R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call
Output:
Completed Kmer Bias Calculation Adding Kmer Bias (1 of 16) Adding Kmer Bias (2 of 16) Adding Kmer Bias (3 of 16) Adding Kmer Bias (4 of 16) Adding Kmer Bias (5 of 16) Adding Kmer Bias (6 of 16) Adding Kmer Bias (7 of 16) Adding Kmer Bias (8 of 16) Adding Kmer Bias (9 of 16) Adding Kmer Bias (10 of 16) Adding Kmer Bias (11 of 16) Adding Kmer Bias (12 of 16) Adding Kmer Bias (13 of 16) Adding Kmer Bias (14 of 16) Adding Kmer Bias (15 of 16) Adding Kmer Bias (16 of 16) 2025-10-31 18:20:10.540152 : Finished Creation of Coverage Files!, 8.833 mins elapsed. ArchR logging successful to : ArchRLogs/ArchR-addGroupCoverages-91a856f6d087-Date-2025-10-31_Time-18-11-20.553077.log
code cell
In [78]:
projHeme4@projectMetadata$GroupCoverages$Clusters2$coverageMetadata
Result:
DataFrame with 16 rows and 5 columns Group Name File nCells <character> <character> <character> <integer> 1 ACs ACs._.fetal_eye_exce.. /share2/pub/zhouyj/z.. 500 2 ACs ACs._.fetal_retina /share2/pub/zhouyj/z.. 179 3 BCs BCs._.fetal_retina /share2/pub/zhouyj/z.. 136 4 BCs BCs._.fetal_eye_exce.. /share2/pub/zhouyj/z.. 117 5 Cones Cones._.Rep1 /share2/pub/zhouyj/z.. 40 ... ... ... ... ... 12 Rods Rods._.fetal_eye_exc.. /share2/pub/zhouyj/z.. 435 13 RPCs RPCs._.Rep1 /share2/pub/zhouyj/z.. 134 14 RPCs RPCs._.Rep2 /share2/pub/zhouyj/z.. 40 15 RPE RPE._.fetal_eye_exce.. /share2/pub/zhouyj/z.. 500 16 RPE RPE._.fetal_retina /share2/pub/zhouyj/z.. 69 nInsertions <numeric> 1 4440076 2 2864548 3 4201458 4 1642106 5 2234234 ... ... 12 10706980 13 825818 14 200906 15 10886846 16 2480500
code cell
In [79]:
groups <- addGroupCoverages(ArchRProj = projHeme3, groupBy = "Clusters2", returnGroups = TRUE)
Output:
ArchR logging to : ArchRLogs/ArchR-addGroupCoverages-91a848f6d48c-Date-2025-10-31_Time-18-49-00.131534.log If there is an issue, please report to github with logFile! ACs (1 of 8) : CellGroups N = 2 BCs (2 of 8) : CellGroups N = 2 Cones (3 of 8) : CellGroups N = 2 Fibroblasts (4 of 8) : CellGroups N = 2 Microphages (5 of 8) : CellGroups N = 2 Rods (6 of 8) : CellGroups N = 2 RPCs (7 of 8) : CellGroups N = 2 RPE (8 of 8) : CellGroups N = 2
markdown cell

10. Calling Peaks

code cell
In [81]:
pathToMacs2 <- findMacs2() pathToMacs2
Output:
Searching For MACS2.. Found with $PATH at /share/pub/zhouyj/anaconda3/envs/scRNA/bin/macs2
'macs2'
code cell
In [82]:
projHeme4 <- addReproduciblePeakSet( ArchRProj = projHeme4, groupBy = "Clusters2", pathToMacs2 = pathToMacs2 )
Output:
ArchR logging to : ArchRLogs/ArchR-addReproduciblePeakSet-91a81db437b6-Date-2025-10-31_Time-19-02-54.509429.log If there is an issue, please report to github with logFile! Calling Peaks with Macs2 2025-10-31 19:02:56.815876 : Peak Calling Parameters!, 0.038 mins elapsed.
Output:
Group nCells nCellsUsed nReplicates nMin nMax maxPeaks ACs ACs 861 679 2 179 500 150000 BCs BCs 253 253 2 117 136 126500 Cones Cones 69 62 2 40 40 31000 Fibroblasts Fibroblasts 2859 570 2 70 500 150000 Microphages Microphages 306 306 2 40 266 150000 Rods Rods 1041 935 2 435 500 150000 RPCs RPCs 174 174 2 40 134 87000 RPE RPE 622 569 2 69 500 150000
Output:
2025-10-31 19:02:56.825329 : Batching Peak Calls!, 0.039 mins elapsed. 2025-10-31 19:02:56.83453 : Batch Execution w/ safelapply!, 0 mins elapsed. 2025-10-31 19:04:29.351633 : Identifying Reproducible Peaks!, 1.581 mins elapsed.
Output:
ted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call R_zmq_msg_send errno: 4 strerror: Interrupted system call
Output:
2025-10-31 19:11:29.996123 : Creating Union Peak Set!, 8.591 mins elapsed. Converged after 6 iterations! Plotting Ggplot! 2025-10-31 19:22:42.27675 : Finished Creating Union Peak Set (129092)!, 19.796 mins elapsed.
code cell
In [83]:
list.files(path=paste0(getOutputDirectory(ArchRProj = projHeme4),"/PeakCalls"))
'Clusters2'
code cell
In [84]:
getPeakSet(projHeme4)
Result:
GRanges object with 129092 ranges and 13 metadata columns: seqnames ranges strand | score <Rle> <IRanges> <Rle> | <numeric> BCs chr1 794305-794805 * | 6.39722 Fibroblasts chr1 826534-827034 * | 13.74890 Rods chr1 827309-827809 * | 157.73900 RPCs chr1 850196-850696 * | 3.37943 Rods chr1 869681-870181 * | 257.91700 ... ... ... ... . ... Microphages chrX 155632362-155632862 * | 6.60554 Rods chrX 155767346-155767846 * | 33.39460 Rods chrX 155820060-155820560 * | 28.29440 BCs chrX 155880528-155881028 * | 5.89683 Cones chrX 155881055-155881555 * | 24.78860 replicateScoreQuantile groupScoreQuantile Reproducibility <numeric> <numeric> <numeric> BCs 0.645 0.303 2 Fibroblasts 0.569 0.199 2 Rods 0.907 0.856 2 RPCs 0.586 0.193 2 Rods 0.897 0.841 2 ... ... ... ... Microphages 0.618 0.195 2 Rods 0.743 0.602 2 Rods 0.606 0.392 2 BCs 0.358 0.081 2 Cones 0.915 0.859 2 GroupReplicate distToGeneStart nearestGene peakType <character> <integer> <character> <character> BCs BCs._.fetal_eye_exce.. 22815 FAM87B Distal Fibroblasts Fibroblasts._.fetal_.. 737 LINC00115 Exonic Rods Rods._.fetal_eye_exc.. 36 LINC00115 Promoter RPCs RPCs._.Rep2 22923 LINC00115 Intronic Rods Rods._.fetal_retina 6971 FAM41C Intronic ... ... ... ... ... Microphages Microphages._.Rep1 37331 TMLHE Intronic Rods Rods._.fetal_eye_exc.. 97651 TMLHE Distal Rods Rods._.fetal_retina 150365 TMLHE Distal BCs BCs._.fetal_retina 210833 TMLHE Distal Cones Cones._.Rep2 211360 TMLHE Distal distToTSS nearestTSS GC idx N <integer> <character> <numeric> <integer> <numeric> BCs 8754 uc057awx.1 0.4810 1 0 Fibroblasts 47 uc057axb.1 0.5150 2 0 Rods 36 uc010nxx.3 0.6966 3 0 RPCs 901 uc057axk.1 0.4731 4 0 Rods 269 uc057axl.1 0.7405 5 0 ... ... ... ... ... ... Microphages 19675 uc004fnn.5 0.4271 2466 0 Rods 215 uc004fnq.2 0.5409 2467 0 Rods 52497 uc004fnq.2 0.4671 2468 0 BCs 514 uc004fnr.4 0.4212 2469 0 Cones 11 uc004fnr.4 0.6287 2470 0 ------- seqinfo: 23 sequences from an unspecified genome; no seqlengths
code cell
In [85]:
projHeme4 <- saveArchRProject(ArchRProj = projHeme4, outputDirectory = "Save-ProjHeme4", load = TRUE)
Output:
Copying ArchRProject to new outputDirectory : /share2/pub/zhouyj/zhouyj/Human_eye_atlas/analysis/demo/Save-ProjHeme4 Copying Arrow Files... Copying Arrow Files (1 of 2) Copying Arrow Files (2 of 2) Getting ImputeWeights No imputeWeights found, returning NULL Copying Other Files... Copying Other Files (1 of 6): Embeddings Copying Other Files (2 of 6): GroupCoverages Copying Other Files (3 of 6): IterativeLSI Copying Other Files (4 of 6): PeakCalls Copying Other Files (5 of 6): Plots Copying Other Files (6 of 6): RNAIntegration Saving ArchRProject... Loading ArchRProject... Successfully loaded ArchRProject! / | / \ . / |. \\\ / |. \\\ / `|. \\\ / |. \ / |\ \\#####\ / || ==###########> / || \\##==......\ / || ______ = =|__ /__ || \\\ ,--' ,----`-,__ ___/' --,-`-===================##========> \ ' ##_______ _____ ,--,__,=##,__ /// , __== ___,-,__,--'#' ===' `-' | ##,-/ -,____,---' \\####\\________________,--\\_##,/ ___ .______ ______ __ __ .______ / \ | _ \ / || | | | | _ \ / ^ \ | |_) | | ,----'| |__| | | |_) | / /_\ \ | / | | | __ | | / / _____ \ | |\ \\___ | `----.| | | | | |\ \\___. /__/ \__\ | _| `._____| \______||__| |__| | _| `._____|
code cell
In [86]:
projHeme5 <- addPeakMatrix(projHeme4) getOutputDirectory(ArchRProj = projHeme5) getAvailableMatrices(projHeme5)
Output:
ArchR logging to : ArchRLogs/ArchR-addPeakMatrix-91a8539812a2-Date-2025-10-31_Time-19-22-50.804515.log If there is an issue, please report to github with logFile! 2025-10-31 19:22:55.422745 : Batch Execution w/ safelapply!, 0 mins elapsed. ArchR logging successful to : ArchRLogs/ArchR-addPeakMatrix-91a8539812a2-Date-2025-10-31_Time-19-22-50.804515.log
'/share2/pub/zhouyj/zhouyj/Human_eye_atlas/analysis/demo/Save-ProjHeme4'
  1. 'GeneIntegrationMatrix'
  2. 'GeneScoreMatrix'
  3. 'PeakMatrix'
  4. 'TileMatrix'
code cell
In [87]:
rm(projHeme4) gc()
A matrix: 2 × 6 of type dbl
used(Mb)gc trigger(Mb)max used(Mb)
Ncells 12339869 659.1 21390551 1142.4 21390551 1142.4
Vcells6460434074929.0184841831814102.4177788051513564.2
markdown cell

11. Identifying Marker Peaks

code cell
In [88]:
#Identify #Between groups by set useGroups and bgdGroups markerPeaks <- getMarkerFeatures( ArchRProj = projHeme5, useMatrix = "PeakMatrix", groupBy = "Clusters2", bias = c("TSSEnrichment", "log10(nFrags)"), testMethod = "wilcoxon" )
Output:
ArchR logging to : ArchRLogs/ArchR-getMarkerFeatures-91a823dd57ca-Date-2025-10-31_Time-19-24-19.361271.log If there is an issue, please report to github with logFile! MatrixClass = Sparse.Integer.Matrix 2025-10-31 19:24:19.984723 : Matching Known Biases, 0.004 mins elapsed. ########### 2025-10-31 19:25:00.011829 : Completed Pairwise Tests, 0.671 mins elapsed. ########### ArchR logging successful to : ArchRLogs/ArchR-getMarkerFeatures-91a823dd57ca-Date-2025-10-31_Time-19-24-19.361271.log
code cell
In [89]:
markerPeaks
Result:
class: SummarizedExperiment dim: 129092 8 metadata(2): MatchInfo Params assays(7): Log2FC Mean ... AUC MeanBGD rownames(129092): 1 2 ... 129091 129092 rowData names(4): seqnames idx start end colnames(8): ACs BCs ... RPCs RPE colData names(0):
code cell
In [109]:
markerList <- getMarkers(markerPeaks, cutOff = "FDR <= 0.05 & Log2FC >= 1") head(markerList$ACs)
Result:
DataFrame with 6 rows and 7 columns seqnames idx start end Log2FC FDR MeanDiff <Rle> <integer> <integer> <integer> <numeric> <numeric> <numeric> 2757 chr1 2757 26542185 26542685 1.19779 0.00135953 0.238590 81998 chr22 2747 46674339 46674839 6.88919 0.00135953 0.236970 12152 chr1 12152 236065000 236065500 1.61085 0.00161278 0.477419 56006 chr18 104 3177818 3178318 7.45332 0.00181553 0.351322 60755 chr19 2136 16258413 16258913 6.69128 0.00247413 0.206335 105726 chr6 3975 89035754 89036254 1.05297 0.00339361 0.201822
code cell
In [91]:
#Plotting marker peaks heatmapPeaks <- markerHeatmap( seMarker = markerPeaks, cutOff = "FDR <= 0.1 & Log2FC >= 0.5", transpose = TRUE )
Output:
Warning message: “'markerHeatmap' is deprecated. Use 'plotMarkerHeatmap' instead. See help("Deprecated")” ArchR logging to : ArchRLogs/ArchR-plotMarkerHeatmap-91a87adddb1c-Date-2025-10-31_Time-19-25-03.501813.log If there is an issue, please report to github with logFile! Identified 53861 markers!
Output:
[1] "chr1:15345799-15346299" "chr1:22652905-22653405" [3] "chr1:22662317-22662817" "chr1:25247315-25247815" [5] "chr1:26542185-26542685" "chr1:30756609-30757109" [7] "chr1:31789140-31789640" "chr1:32981000-32981500" [9] "chr1:40385615-40386115" "chr1:40653829-40654329" [11] "chr1:44031289-44031789" "chr1:50251731-50252231" [13] "chr1:59292570-59293070" "chr1:81800172-81800672" [15] "chr1:85201486-85201986" "chr1:1426870-1427370" [17] "chr1:3529302-3529802" "chr1:4760380-4760880" [19] "chr1:5287664-5288164" "chr1:6145100-6145600" [21] "chr1:7052660-7053160" "chr1:7427476-7427976" [23] "chr1:9883898-9884398" "chr1:18134656-18135156" [25] "chr1:18142896-18143396" "chr1:18182749-18183249" [27] "chr1:18190460-18190960" "chr1:19028006-19028506" [29] "chr1:19640442-19640942" "chr1:19730618-19731118" [31] "chr12:32877591-32878091" "chr5:168291370-168291870" [33] "chr1:39562818-39563318" "chr1:180188651-180189151" [35] "chr1:201941573-201942073" "chr1:234779538-234780038" [37] "chr10:118965616-118966116" "chr12:3139801-3140301" [39] "chr12:109715428-109715928" "chr14:66649464-66649964" [41] "chr14:92886342-92886842" "chr14:99540242-99540742" [43] "chr15:34754777-34755277" "chr16:20866167-20866667" [45] "chr16:85431694-85432194" "chr1:916454-916954" [47] "chr1:940711-941211" "chr1:1013184-1013684" [49] "chr1:1014283-1014783" "chr1:1019878-1020378" [51] "chr1:1349671-1350171" "chr1:1352586-1353086" [53] "chr1:1356999-1357499" "chr1:1357783-1358283" [55] "chr1:1358361-1358861" "chr1:1359159-1359659" [57] "chr1:1359675-1360175" "chr1:1360521-1361021" [59] "chr1:1361929-1362429" "chr1:1435436-1435936" [61] "chr1:1433751-1434251" "chr1:1713451-1713951" [63] "chr1:1858593-1859093" "chr1:2148658-2149158" [65] "chr1:2255410-2255910" "chr1:2256059-2256559" [67] "chr1:2548636-2549136" "chr1:2578271-2578771" [69] "chr1:5698062-5698562" "chr1:6935302-6935802" [71] "chr1:7491405-7491905" "chr1:7543022-7543522" [73] "chr1:7648968-7649468" "chr1:7760222-7760722" [75] "chr1:9239591-9240091" "chr1:869681-870181" [77] "chr1:909902-910402" "chr1:911146-911646" [79] "chr1:911996-912496" "chr1:932831-933331" [81] "chr1:935223-935723" "chr1:935772-936272" [83] "chr1:938022-938522" "chr1:939048-939548" [85] "chr1:939567-940067" "chr1:941591-942091" [87] "chr1:942299-942799" "chr1:945174-945674" [89] "chr1:947850-948350" "chr1:951304-951804" [91] "chr1:924788-925288" "chr1:925411-925911" [93] "chr1:933396-933896" "chr1:991895-992395" [95] "chr1:1025031-1025531" "chr1:1059407-1059907" [97] "chr1:1079460-1079960" "chr1:1158106-1158606" [99] "chr1:1422879-1423379" "chr1:1433030-1433530" [101] "chr1:1599206-1599706" "chr1:1891492-1891992" [103] "chr1:2025415-2025915" "chr1:2248828-2249328" [105] "chr1:2257264-2257764" "chr1:2481980-2482480" [107] "chr1:2527481-2527981"
Output:
Adding Annotations.. Preparing Main Heatmap.. ArchR logging successful to : ArchRLogs/ArchR-plotMarkerHeatmap-91a87adddb1c-Date-2025-10-31_Time-19-25-03.501813.log
code cell
In [92]:
draw(heatmapPeaks, heatmap_legend_side = "bot", annotation_legend_side = "bot")
Image output
code cell
In [110]:
#Volcano plots pma <- plotMarkers(seMarker = markerPeaks, name = "ACs", cutOff = "FDR <= 0.1 & Log2FC >= 1", plotAs = "MA") pma pv <- plotMarkers(seMarker = markerPeaks, name = "ACs", cutOff = "FDR <= 0.1 & Log2FC >= 1", plotAs = "Volcano") pv
Output:
Warning message: “Removed 28 rows containing missing values or values outside the scale range (`geom_point_rast()`).” Warning message: “Removed 28 rows containing missing values or values outside the scale range (`geom_point_rast()`).”
Image output Image output
code cell
In [118]:
#Marker Peaks in Browser Tracks #markerGenes <- c( # "CCND1", ##RPCs # "ATOH7", ##NPCs # "GAP43", ##RGCs # "ONECUT1", ##HCs # "TFAP2A", ##ACs # "PRDM1", ##Cones # "NRL" ##Rods # ) p16 <- plotBrowserTrack( ArchRProj = projHeme5, groupBy = "Clusters2", geneSymbol = c('CCND1'), features = getMarkers(markerPeaks, cutOff = "FDR <= 0.1 & Log2FC >= 1", returnGR = TRUE)[['RPCs']], upstream = 50000, downstream = 50000 ) plotPDF(p16, name = "Plot-Tracks-With-Features", width = 5, height = 5, ArchRProj = projHeme5, addDOC = FALSE)
Output:
ArchR logging to : ArchRLogs/ArchR-plotBrowserTrack-91a82ef79af8-Date-2025-10-31_Time-19-42-43.092084.log If there is an issue, please report to github with logFile! 2025-10-31 19:42:43.416677 : Validating Region, 0.005 mins elapsed.
Output:
GRanges object with 1 range and 2 metadata columns: seqnames ranges strand | gene_id symbol <Rle> <IRanges> <Rle> | <character> <character> [1] chr11 69641087-69654474 + | 595 CCND1 ------- seqinfo: 24 sequences from hg38 genome
Output:
2025-10-31 19:42:43.514388 : Adding Bulk Tracks (1 of 1), 0.007 mins elapsed. 2025-10-31 19:42:44.557288 : Adding Feature Tracks (1 of 1), 0.024 mins elapsed. 2025-10-31 19:42:44.622154 : Adding Gene Tracks (1 of 1), 0.026 mins elapsed. 2025-10-31 19:42:44.850726 : Plotting, 0.029 mins elapsed. ArchR logging successful to : ArchRLogs/ArchR-plotBrowserTrack-91a82ef79af8-Date-2025-10-31_Time-19-42-43.092084.log Plotting Gtable!
Output:
NULL _msg_send errno: 4 strerror: Interrupted system call
code cell
In [119]:
grid::grid.draw(p16$CCND1)
Image output
markdown cell

12. Motif and Feature Enrichment

code cell
In [120]:
projHeme5 <- addMotifAnnotations(ArchRProj = projHeme5, motifSet = "cisbp", name = "Motif")
Output:
ArchR logging to : ArchRLogs/ArchR-addMotifAnnotations-91a8121a56c-Date-2025-10-31_Time-19-43-03.340663.log If there is an issue, please report to github with logFile! 2025-10-31 19:43:08.924174 : Getting Motif Set, Species : Homo sapiens, 0.005 mins elapsed. Using version 2 motifs! 2025-10-31 19:43:10.724987 : Finding Motif Positions with motifmatchr!, 0.035 mins elapsed. 2025-10-31 19:53:01.423081 : All Motifs Overlap at least 1 peak!, 9.88 mins elapsed. 2025-10-31 19:53:01.431496 : Creating Motif Overlap Matrix, 9.88 mins elapsed. 2025-10-31 19:53:03.098856 : Finished Getting Motif Info!, 9.908 mins elapsed. ArchR logging successful to : ArchRLogs/ArchR-addMotifAnnotations-91a8121a56c-Date-2025-10-31_Time-19-43-03.340663.log
code cell
In [121]:
#Find motif for specific gene #Make chr:star-end for peaks pSet <- getPeakSet(ArchRProj = projHeme5) pSet$name <- paste(seqnames(pSet), start(pSet), end(pSet), sep = "_")
code cell
In [122]:
#Find matched motifs matches <- getMatches(ArchRProj = projHeme5, name = "Motif") rownames(matches) <- paste(seqnames(matches), start(matches), end(matches), sep = "_") matches <- matches[pSet$name]
code cell
In [123]:
#Using CEBPA as example gr <- GRanges(seqnames = c("chr19"), ranges = IRanges(start = c(33792929), end = c(33794030)))
code cell
In [124]:
queryHits <- queryHits(findOverlaps(query = pSet, subject = gr, type = "within")) colnames(matches)[which(assay(matches[queryHits,]))]
code cell
In [126]:
#Motif Enrichment in Marker Peaks enrichMotifs <- peakAnnoEnrichment( seMarker = markerPeaks, ArchRProj = projHeme5, peakAnnotation = "Motif", cutOff = "FDR <= 0.1 & Log2FC >= 0.5" )
Output:
ArchR logging to : ArchRLogs/ArchR-peakAnnoEnrichment-91a825183bce-Date-2025-10-31_Time-19-58-56.148821.log If there is an issue, please report to github with logFile! 2025-10-31 19:59:06.337613 : Computing Enrichments 1 of 8, 0.17 mins elapsed. 2025-10-31 19:59:06.456576 : Computing Enrichments 2 of 8, 0.172 mins elapsed. 2025-10-31 19:59:06.586954 : Computing Enrichments 3 of 8, 0.174 mins elapsed. 2025-10-31 19:59:06.712253 : Computing Enrichments 4 of 8, 0.176 mins elapsed. 2025-10-31 19:59:06.840202 : Computing Enrichments 5 of 8, 0.178 mins elapsed. 2025-10-31 19:59:06.970618 : Computing Enrichments 6 of 8, 0.18 mins elapsed. 2025-10-31 19:59:07.11032 : Computing Enrichments 7 of 8, 0.183 mins elapsed. 2025-10-31 19:59:07.233069 : Computing Enrichments 8 of 8, 0.185 mins elapsed. ArchR logging successful to : ArchRLogs/ArchR-peakAnnoEnrichment-91a825183bce-Date-2025-10-31_Time-19-58-56.148821.log
code cell
In [127]:
heatmapEM <- plotEnrichHeatmap(enrichMotifs, n = 7, transpose = TRUE)
Output:
ArchR logging to : ArchRLogs/ArchR-plotEnrichHeatmap-91a82c8112e1-Date-2025-10-31_Time-19-59-07.637417.log If there is an issue, please report to github with logFile! Adding Annotations.. Preparing Main Heatmap..
code cell
In [128]:
#Plotting motif logos pwm <- getPeakAnnotation(projHeme5, "Motif")$motifs[["SOX6_868"]] pwm
Result:
An object of class PWMatrix ID: ENSG00000110693_LINE19574_SOX6_I_N7 Name: SOX6 Matrix Class: Unknown strand: * Pseudocounts: Tags: $ensembl [1] "ENSG00000110693" Background: A C G T 0.25 0.25 0.25 0.25 Matrix: [,1] [,2] [,3] [,4] [,5] [,6] [,7] A 0.2856363 1.236986 1.297203 -2.131381 1.340070 1.259222 -1.224160 C -0.6196693 -2.173286 -3.473518 1.257839 -3.473518 -1.259465 -1.366327 G -0.4525775 -1.628926 -1.447733 -1.817039 -2.592395 -1.817039 -3.473518 T 0.4023151 -1.407137 -2.592395 -1.604399 -2.592395 -3.473518 1.229625 [,8] A 0.4500618 C -0.4283456 G 0.2154626 T -0.6169845
code cell
In [129]:
PWMatrixToProbMatrix <- function(x){ if (class(x) != "PWMatrix") stop("x must be a TFBSTools::PWMatrix object") m <- (exp(as(x, "matrix"))) * TFBSTools::bg(x)/sum(TFBSTools::bg(x)) m <- t(t(m)/colSums(m)) m } ppm <- PWMatrixToProbMatrix(pwm) ppm
A matrix: 4 × 8 of type dbl
A0.33265210.861303410.9147622770.029668340.9548277870.8806701000.0735011360.3921023
C0.13453060.028450760.0077519380.879452520.0077519380.0709514250.0637605140.1628965
G0.15899670.049035030.0587756470.040626540.0187101380.0406265370.0077519380.3101089
T0.37382060.061210800.0187101380.050252600.0187101380.0077519380.8549864120.1348923
code cell
In [130]:
library(ggseqlogo) colSums(ppm) %>% range ggseqlogo(ppm, method = "bits")
Output:
Warning message: “package ‘ggseqlogo’ was built under R version 4.3.3”
  1. 1
  2. 1
Image output
code cell
In [131]:
#For more customizable enrichments, see 14.2 in ArchR website
markdown cell

13. ChromVAR Deviatons Enrichment

code cell
In [142]:
if("Motif" %ni% names(projHeme5@peakAnnotation)){ projHeme5 <- addMotifAnnotations(ArchRProj = projHeme5, motifSet = "cisbp", name = "Motif") }
code cell
In [143]:
projHeme5 <- addBgdPeaks(projHeme5)
Output:
Identifying Background Peaks!
code cell
In [144]:
projHeme5 <- addDeviationsMatrix( ArchRProj = projHeme5, peakAnnotation = "Motif", force = TRUE )
Output:
Using Previous Background Peaks! ArchR logging to : ArchRLogs/ArchR-addDeviationsMatrix-91a83547fae9-Date-2025-10-31_Time-20-06-56.868617.log If there is an issue, please report to github with logFile! 2025-10-31 20:07:00.761119 : Batch Execution w/ safelapply!, 0 mins elapsed. ########### 2025-10-31 20:15:20.370548 : Completed Computing Deviations!, 8.392 mins elapsed. ########### ArchR logging successful to : ArchRLogs/ArchR-addDeviationsMatrix-91a83547fae9-Date-2025-10-31_Time-20-06-56.868617.log
code cell
In [145]:
plotVarDev <- getVarDeviations(projHeme5, name = "MotifMatrix", plot = TRUE)
Output:
DataFrame with 6 rows and 6 columns seqnames idx name combinedVars combinedMeans rank <Rle> <integer> <character> <numeric> <numeric> <integer> f470 z 470 GSC_470 22.4508 -0.442743 1 f402 z 402 GSC2_402 21.3914 -0.438434 2 f560 z 560 DMBX1_560 21.2762 -0.428337 3 f426 z 426 PITX3_426 20.4690 -0.416372 4 f565 z 565 DPRX_565 19.8257 -0.419043 5 f404 z 404 PITX1_404 16.6746 -0.390313 6
code cell
In [146]:
plotVarDev
Output:
Warning message: “ggrepel: 7 unlabeled data points (too many overlaps). Consider increasing max.overlaps”
Image output
code cell
In [147]:
motifs <- c( "CCND1", ##RPCs "ATOH7", ##NPCs "GAP43", ##RGCs "ONECUT1", ##HCs "TFAP2A", ##ACs "PRDM1", ##Cones "NRL" ##Rods ) markerMotifs <- getFeatures(projHeme5, select = paste(motifs, collapse="|"), useMatrix = "MotifMatrix") markerMotifs
  1. 'z:ONECUT1_295'
  2. 'z:PRDM16_211'
  3. 'z:PRDM1_163'
  4. 'z:NRL_123'
  5. 'z:ATOH7_84'
  6. 'z:TFAP2A_5'
  7. 'deviations:ONECUT1_295'
  8. 'deviations:PRDM16_211'
  9. 'deviations:PRDM1_163'
  10. 'deviations:NRL_123'
  11. 'deviations:ATOH7_84'
  12. 'deviations:TFAP2A_5'
code cell
In [148]:
#Remove extra motifs markerMotifs <- grep("z:", markerMotifs, value = TRUE) markerMotifs <- markerMotifs[markerMotifs %ni% c("z:PRDM16_211",'z:PRDM16_211')] markerMotifs
  1. 'z:ONECUT1_295'
  2. 'z:PRDM1_163'
  3. 'z:NRL_123'
  4. 'z:ATOH7_84'
  5. 'z:TFAP2A_5'
code cell
In [149]:
pD <- plotGroups(ArchRProj = projHeme5, groupBy = "Clusters2", colorBy = "MotifMatrix", name = markerMotifs, imputeWeights = getImputeWeights(projHeme5) )
Output:
Getting ImputeWeights No imputeWeights found, returning NULL Getting Matrix Values... 2025-10-31 20:19:20.371884 : 1 2 3 4 5
code cell
In [152]:
pD2 <- lapply(seq_along(pD), function(x){ if(x != 1){ pD[[x]] + guides(color = "none", fill = "none") + theme_ArchR(baseSize = 6) + theme(plot.margin = unit(c(0.1, 0.1, 0.1, 0.1), "cm")) + theme( axis.text.y=element_blank(), axis.ticks.y=element_blank(), axis.title.y=element_blank() ) + ylab("") }else{ pD[[x]] + guides(color = "none", fill = "none") + theme_ArchR(baseSize = 6) + theme(plot.margin = unit(c(0.1, 0.1, 0.1, 0.1), "cm")) + theme( axis.ticks.y=element_blank(), axis.title.y=element_blank() ) + ylab("") } }) do.call(cowplot::plot_grid, c(list(nrow = 1, rel_widths = c(2, rep(1, length(pD2) - 1))),pD2))
Output:
Picking joint bandwidth of 0.266 Picking joint bandwidth of 0.32 Picking joint bandwidth of 0.274 Picking joint bandwidth of 0.362 Picking joint bandwidth of 0.266
Image output
code cell
In [153]:
#Plot in UMAP pE <- plotEmbedding( ArchRProj = projHeme5, colorBy = "MotifMatrix", name = sort(markerMotifs), embedding = "UMAP", imputeWeights = getImputeWeights(projHeme5) )
Output:
Getting ImputeWeights No imputeWeights found, returning NULL ArchR logging to : ArchRLogs/ArchR-plotEmbedding-91a843902a36-Date-2025-10-31_Time-20-20-00.828931.log If there is an issue, please report to github with logFile! Getting UMAP Embedding ColorBy = MotifMatrix Getting Matrix Values... 2025-10-31 20:20:01.524164 : Plotting Embedding 1 2 3 4 5 ArchR logging successful to : ArchRLogs/ArchR-plotEmbedding-91a843902a36-Date-2025-10-31_Time-20-20-00.828931.log
code cell
In [154]:
pE2 <- lapply(pE, function(x){ x + guides(color = "none", fill = "none") + theme_ArchR(baseSize = 6.5) + theme(plot.margin = unit(c(0, 0, 0, 0), "cm")) + theme( axis.text.x=element_blank(), axis.ticks.x=element_blank(), axis.text.y=element_blank(), axis.ticks.y=element_blank() ) }) do.call(cowplot::plot_grid, c(list(ncol = 3),pE2))
Image output
code cell
#Plot by gene expression scores markerGS <- getFeatures(projHeme5, select = paste(motifs, collapse="|"), useMatrix = "GeneScoreMatrix") markerGS <- markerGS[markerGS %in% motifs] markerGS
  1. 'ATOH7'
  2. 'CCND1'
  3. 'NRL'
  4. 'ONECUT1'
  5. 'GAP43'
  6. 'TFAP2A'
  7. 'PRDM1'
code cell
pG <- plotEmbedding( ArchRProj = projHeme5, colorBy = "GeneScoreMatrix", name = sort(markerGS), embedding = "UMAP", imputeWeights = getImputeWeights(projHeme5) )
code cell
pG2 <- lapply(pG, function(x){ x + guides(color = "none", fill = "none") + theme_ArchR(baseSize = 6.5) + theme(plot.margin = unit(c(0, 0, 0, 0), "cm")) + theme( axis.text.x=element_blank(), axis.ticks.x=element_blank(), axis.text.y=element_blank(), axis.ticks.y=element_blank() ) }) do.call(cowplot::plot_grid, c(list(ncol = 3),pG2))
code cell
#Plot by linked gene expression markerRNA <- getFeatures(projHeme5, select = paste(motifs, collapse="|"), useMatrix = "GeneIntegrationMatrix") markerRNA <- markerRNA[markerRNA %in% motifs] markerRNA
code cell
pl <- plotEmbedding( ArchRProj = projHeme5, colorBy = "GeneIntegrationMatrix", name = sort(markerRNA), embedding = "UMAP", continuousSet = "blueYellow", imputeWeights = getImputeWeights(projHeme5) )
code cell
pl2 <- lapply(pl, function(x){ x + guides(color = "none", fill = "none") + theme_ArchR(baseSize = 6.5) + theme(plot.margin = unit(c(0, 0, 0, 0), "cm")) + theme( axis.text.x=element_blank(), axis.ticks.x=element_blank(), axis.text.y=element_blank(), axis.ticks.y=element_blank() ) }) do.call(cowplot::plot_grid, c(list(ncol = 3),pl2))
markdown cell

14. Integrative Analysis

code cell
In [132]:
projHeme5 <- addCoAccessibility( ArchRProj = projHeme5, reducedDims = "IterativeLSI" )
Output:
ArchR logging to : ArchRLogs/ArchR-addCoAccessibility-91a8736e9148-Date-2025-10-31_Time-19-59-18.421382.log If there is an issue, please report to github with logFile! 2025-10-31 19:59:19.637594 : Computing KNN, 0.02 mins elapsed. 2025-10-31 19:59:19.764566 : Identifying Non-Overlapping KNN pairs, 0.022 mins elapsed. 2025-10-31 19:59:22.033373 : Identified 475 Groupings!, 0.06 mins elapsed. 2025-10-31 19:59:27.011613 : Computing Co-Accessibility chr1 (1 of 23), 0.143 mins elapsed. 2025-10-31 19:59:33.336164 : Computing Co-Accessibility chr2 (2 of 23), 0.249 mins elapsed. 2025-10-31 19:59:38.255038 : Computing Co-Accessibility chr3 (3 of 23), 0.331 mins elapsed. 2025-10-31 19:59:42.666384 : Computing Co-Accessibility chr4 (4 of 23), 0.404 mins elapsed. 2025-10-31 19:59:46.506188 : Computing Co-Accessibility chr5 (5 of 23), 0.468 mins elapsed. 2025-10-31 19:59:50.578756 : Computing Co-Accessibility chr6 (6 of 23), 0.536 mins elapsed. 2025-10-31 19:59:54.922091 : Computing Co-Accessibility chr7 (7 of 23), 0.608 mins elapsed. 2025-10-31 19:59:59.131901 : Computing Co-Accessibility chr8 (8 of 23), 0.679 mins elapsed. 2025-10-31 20:00:03.037311 : Computing Co-Accessibility chr9 (9 of 23), 0.744 mins elapsed. 2025-10-31 20:00:07.09436 : Computing Co-Accessibility chr10 (10 of 23), 0.811 mins elapsed. 2025-10-31 20:00:11.137756 : Computing Co-Accessibility chr11 (11 of 23), 0.879 mins elapsed. 2025-10-31 20:00:15.627425 : Computing Co-Accessibility chr12 (12 of 23), 0.953 mins elapsed. 2025-10-31 20:00:19.890201 : Computing Co-Accessibility chr13 (13 of 23), 1.024 mins elapsed. 2025-10-31 20:00:23.276465 : Computing Co-Accessibility chr14 (14 of 23), 1.081 mins elapsed. 2025-10-31 20:00:27.039278 : Computing Co-Accessibility chr15 (15 of 23), 1.144 mins elapsed. 2025-10-31 20:00:30.757071 : Computing Co-Accessibility chr16 (16 of 23), 1.206 mins elapsed. 2025-10-31 20:00:34.730788 : Computing Co-Accessibility chr17 (17 of 23), 1.272 mins elapsed. 2025-10-31 20:00:39.286797 : Computing Co-Accessibility chr18 (18 of 23), 1.348 mins elapsed. 2025-10-31 20:00:42.558676 : Computing Co-Accessibility chr19 (19 of 23), 1.402 mins elapsed. 2025-10-31 20:00:47.111523 : Computing Co-Accessibility chr20 (20 of 23), 1.478 mins elapsed. 2025-10-31 20:00:50.670185 : Computing Co-Accessibility chr21 (21 of 23), 1.537 mins elapsed. 2025-10-31 20:00:53.734493 : Computing Co-Accessibility chr22 (22 of 23), 1.589 mins elapsed. 2025-10-31 20:00:57.203727 : Computing Co-Accessibility chrX (23 of 23), 1.646 mins elapsed. ArchR logging successful to : ArchRLogs/ArchR-addCoAccessibility-91a8736e9148-Date-2025-10-31_Time-19-59-18.421382.log
code cell
In [133]:
cA <- getCoAccessibility( ArchRProj = projHeme5, corCutOff = 0.5, resolution = 1, returnLoops = FALSE )
code cell
In [134]:
cA
Result:
DataFrame with 166850 rows and 11 columns queryHits subjectHits seqnames correlation Variability1 Variability2 <integer> <integer> <Rle> <numeric> <numeric> <numeric> 1 5 22 chr1 0.605403 0.00646632 0.08812707 2 5 24 chr1 0.591242 0.00646632 0.01572732 3 8 30 chr1 0.557246 0.00705825 0.00453338 4 8 42 chr1 0.559279 0.00705825 0.00917213 5 8 53 chr1 0.606811 0.00705825 0.03785115 ... ... ... ... ... ... ... 166846 129068 129062 chrX 0.548497 0.000829654 0.000494894 166847 129068 129071 chrX 0.625279 0.000829654 0.000993719 166848 129071 129058 chrX 0.632109 0.000993719 0.001208354 166849 129071 129062 chrX 0.623128 0.000993719 0.000494894 166850 129071 129068 chrX 0.625279 0.000993719 0.000829654 TStat Pval FDR VarQuantile1 VarQuantile2 <numeric> <numeric> <numeric> <numeric> <numeric> 1 16.5427 7.79188e-49 2.11657e-47 0.786511 0.999423 2 15.9439 4.10395e-46 9.98569e-45 0.786511 0.939097 3 14.5954 4.22879e-40 7.99728e-39 0.805353 0.708505 4 14.6729 1.93124e-40 3.70914e-39 0.805353 0.857611 5 16.6036 4.10729e-49 1.12745e-47 0.805353 0.990253 ... ... ... ... ... ... 166846 14.2666 1.15875e-38 2.05649e-37 0.239415 0.115521 166847 17.4256 6.81007e-53 2.16212e-51 0.239415 0.289439 166848 17.7414 2.34308e-54 7.84521e-53 0.289439 0.344420 166849 17.3275 1.93406e-52 6.03719e-51 0.289439 0.115521 166850 17.4256 6.81007e-53 2.16212e-51 0.289439 0.239415
code cell
In [136]:
cA <- getCoAccessibility( ArchRProj = projHeme5, corCutOff = 0.5, resolution = 1, returnLoops = TRUE )
code cell
In [138]:
cA[[1]]
Result:
GRanges object with 83425 ranges and 9 metadata columns: seqnames ranges strand | correlation Variability1 <Rle> <IRanges> <Rle> | <numeric> <numeric> [1] chr1 869931-933646 * | 0.605403 0.00646632 [2] chr1 869931-936022 * | 0.591242 0.00646632 [3] chr1 905435-940961 * | 0.557246 0.00705825 [4] chr1 905435-967007 * | 0.559279 0.00705825 [5] chr1 905435-1000160 * | 0.606811 0.00705825 ... ... ... ... . ... ... [83421] chrX 154730383-154735921 * | 0.556769 0.005865527 [83422] chrX 154730383-154738200 * | 0.629392 0.006515845 [83423] chrX 154730902-154762145 * | 0.548497 0.000494894 [83424] chrX 154730902-154799790 * | 0.623128 0.000494894 [83425] chrX 154762145-154799790 * | 0.625279 0.000829654 Variability2 TStat Pval FDR VarQuantile1 <numeric> <numeric> <numeric> <numeric> <numeric> [1] 0.08812707 16.5427 7.79188e-49 2.11657e-47 0.786511 [2] 0.01572732 15.9439 4.10395e-46 9.98569e-45 0.786511 [3] 0.00453338 14.5954 4.22879e-40 7.99728e-39 0.805353 [4] 0.00917213 14.6729 1.93124e-40 3.70914e-39 0.805353 [5] 0.03785115 16.6036 4.10729e-49 1.12745e-47 0.805353 ... ... ... ... ... ... [83421] 0.001167913 14.5773 5.07777e-40 9.57071e-39 0.766055 [83422] 0.005865527 17.6150 9.04491e-54 2.96547e-52 0.788074 [83423] 0.000829654 14.2666 1.15875e-38 2.05649e-37 0.115521 [83424] 0.000993719 17.3275 1.93406e-52 6.03719e-51 0.115521 [83425] 0.000993719 17.4256 6.81007e-53 2.16212e-51 0.239415 VarQuantile2 value <numeric> <numeric> [1] 0.999423 0.605403 [2] 0.939097 0.591242 [3] 0.708505 0.557246 [4] 0.857611 0.559279 [5] 0.990253 0.606811 ... ... ... [83421] 0.334468 0.556769 [83422] 0.766055 0.629392 [83423] 0.239415 0.548497 [83424] 0.289439 0.623128 [83425] 0.289439 0.625279 ------- seqinfo: 23 sequences from an unspecified genome; no seqlengths
code cell
In [139]:
cALoops <- cA[[1]] cALoops <- cALoops[cALoops$FDR < 10^-10] cALoops <- cALoops[rowMins(cbind(cALoops$VarQuantile1,cALoops$VarQuantile2)) > 0.35] cALoops
Result:
GRanges object with 51208 ranges and 9 metadata columns: seqnames ranges strand | correlation Variability1 <Rle> <IRanges> <Rle> | <numeric> <numeric> [1] chr1 869931-933646 * | 0.605403 0.00646632 [2] chr1 869931-936022 * | 0.591242 0.00646632 [3] chr1 905435-940961 * | 0.557246 0.00705825 [4] chr1 905435-967007 * | 0.559279 0.00705825 [5] chr1 905435-1000160 * | 0.606811 0.00705825 ... ... ... ... . ... ... [51204] chrX 154460076-154490796 * | 0.521831 0.00184088 [51205] chrX 154460076-154541392 * | 0.597480 0.00534552 [51206] chrX 154490796-154541392 * | 0.594713 0.01306984 [51207] chrX 154728098-154738200 * | 0.529508 0.00214815 [51208] chrX 154730383-154738200 * | 0.629392 0.00651584 Variability2 TStat Pval FDR VarQuantile1 <numeric> <numeric> <numeric> <numeric> <numeric> [1] 0.08812707 16.5427 7.79188e-49 2.11657e-47 0.786511 [2] 0.01572732 15.9439 4.10395e-46 9.98569e-45 0.786511 [3] 0.00453338 14.5954 4.22879e-40 7.99728e-39 0.805353 [4] 0.00917213 14.6729 1.93124e-40 3.70914e-39 0.805353 [5] 0.03785115 16.6036 4.10729e-49 1.12745e-47 0.805353 ... ... ... ... ... ... [51204] 0.01306984 13.3041 1.56668e-34 2.31273e-33 0.464842 [51205] 0.00184088 16.2048 2.69772e-47 6.88544e-46 0.746096 [51206] 0.00534552 16.0885 9.09022e-47 2.26934e-45 0.916135 [51207] 0.00651584 13.5754 1.10446e-35 1.71740e-34 0.509330 [51208] 0.00586553 17.6150 9.04491e-54 2.96547e-52 0.788074 VarQuantile2 value <numeric> <numeric> [1] 0.999423 0.605403 [2] 0.939097 0.591242 [3] 0.708505 0.557246 [4] 0.857611 0.559279 [5] 0.990253 0.606811 ... ... ... [51204] 0.916135 0.521831 [51205] 0.464842 0.597480 [51206] 0.746096 0.594713 [51207] 0.788074 0.529508 [51208] 0.766055 0.629392 ------- seqinfo: 23 sequences from an unspecified genome; no seqlengths
code cell
In [140]:
p17 <- plotBrowserTrack( ArchRProj = projHeme5, groupBy = "Clusters2", geneSymbol = markerGenes, upstream = 50000, downstream = 50000, loops = getCoAccessibility(projHeme5) )
Output:
ArchR logging to : ArchRLogs/ArchR-plotBrowserTrack-91a826687ad8-Date-2025-10-31_Time-20-05-42.333402.log If there is an issue, please report to github with logFile! 2025-10-31 20:05:42.57646 : Validating Region, 0.004 mins elapsed.
Output:
GRanges object with 7 ranges and 2 metadata columns: seqnames ranges strand | gene_id symbol <Rle> <IRanges> <Rle> | <character> <character> [1] chr11 69641087-69654474 + | 595 CCND1 [2] chr10 68230624-68232103 - | 220202 ATOH7 [3] chr3 115623324-115721490 + | 2596 GAP43 [4] chr15 52756989-52791078 - | 3175 ONECUT1 [5] chr6 10393186-10419659 - | 7020 TFAP2A [6] chr6 106086320-106109939 + | 639 PRDM1 [7] chr14 24080107-24115014 - | 4901 NRL ------- seqinfo: 24 sequences from hg38 genome
Output:
2025-10-31 20:05:42.64059 : Adding Bulk Tracks (1 of 7), 0.005 mins elapsed. 2025-10-31 20:05:43.824628 : Adding Feature Tracks (1 of 7), 0.025 mins elapsed. 2025-10-31 20:05:43.888833 : Adding Loop Tracks (1 of 7), 0.026 mins elapsed. 2025-10-31 20:05:44.191854 : Adding Gene Tracks (1 of 7), 0.031 mins elapsed. 2025-10-31 20:05:44.418705 : Plotting, 0.035 mins elapsed. 2025-10-31 20:05:45.343415 : Adding Bulk Tracks (2 of 7), 0.05 mins elapsed. 2025-10-31 20:05:46.402727 : Adding Feature Tracks (2 of 7), 0.068 mins elapsed. 2025-10-31 20:05:46.466118 : Adding Loop Tracks (2 of 7), 0.069 mins elapsed. 2025-10-31 20:05:46.623292 : Adding Gene Tracks (2 of 7), 0.072 mins elapsed. 2025-10-31 20:05:46.85263 : Plotting, 0.075 mins elapsed. 2025-10-31 20:05:47.710006 : Adding Bulk Tracks (3 of 7), 0.09 mins elapsed. 2025-10-31 20:05:48.892102 : Adding Feature Tracks (3 of 7), 0.109 mins elapsed. 2025-10-31 20:05:48.949689 : Adding Loop Tracks (3 of 7), 0.11 mins elapsed. 2025-10-31 20:05:48.986819 : Adding Gene Tracks (3 of 7), 0.111 mins elapsed. 2025-10-31 20:05:49.211452 : Plotting, 0.115 mins elapsed. 2025-10-31 20:05:50.044354 : Adding Bulk Tracks (4 of 7), 0.129 mins elapsed. 2025-10-31 20:05:50.997637 : Adding Feature Tracks (4 of 7), 0.144 mins elapsed. 2025-10-31 20:05:51.059533 : Adding Loop Tracks (4 of 7), 0.145 mins elapsed. 2025-10-31 20:05:51.143282 : Adding Gene Tracks (4 of 7), 0.147 mins elapsed. 2025-10-31 20:05:51.368527 : Plotting, 0.151 mins elapsed. 2025-10-31 20:05:52.161704 : Adding Bulk Tracks (5 of 7), 0.164 mins elapsed. 2025-10-31 20:05:53.374133 : Adding Feature Tracks (5 of 7), 0.184 mins elapsed. 2025-10-31 20:05:53.439228 : Adding Loop Tracks (5 of 7), 0.185 mins elapsed. 2025-10-31 20:05:53.7358 : Adding Gene Tracks (5 of 7), 0.19 mins elapsed. 2025-10-31 20:05:53.986343 : Plotting, 0.194 mins elapsed. 2025-10-31 20:05:54.981452 : Adding Bulk Tracks (6 of 7), 0.211 mins elapsed. 2025-10-31 20:05:56.207274 : Adding Feature Tracks (6 of 7), 0.231 mins elapsed. 2025-10-31 20:05:56.266981 : Adding Loop Tracks (6 of 7), 0.232 mins elapsed. 2025-10-31 20:05:57.065695 : Adding Gene Tracks (6 of 7), 0.246 mins elapsed. 2025-10-31 20:05:57.28871 : Plotting, 0.249 mins elapsed. 2025-10-31 20:05:58.116256 : Adding Bulk Tracks (7 of 7), 0.263 mins elapsed. 2025-10-31 20:05:59.111555 : Adding Feature Tracks (7 of 7), 0.28 mins elapsed. 2025-10-31 20:05:59.175051 : Adding Loop Tracks (7 of 7), 0.281 mins elapsed. 2025-10-31 20:06:00.651995 : Adding Gene Tracks (7 of 7), 0.305 mins elapsed. 2025-10-31 20:06:00.875915 : Plotting, 0.309 mins elapsed. ArchR logging successful to : ArchRLogs/ArchR-plotBrowserTrack-91a826687ad8-Date-2025-10-31_Time-20-05-42.333402.log
code cell
In [141]:
grid::grid.draw(p17$CCND1)
Image output
code cell
projHeme5 <- saveArchRProject(ArchRProj = projHeme5, outputDirectory = "Save-ProjHeme5", load = TRUE)